Introduction


Karsten Schmidtke-Bode 
Leipzig University and Friedrich Schiller University Jena 


The present volume addresses a foundational issue in linguistic typology and 
language science more generally. It concerns the kinds of explanation that ty- 
pologists provide for the cross-linguistic generalizations they uncover, i.e. for 
so-called universals of language. The universals at issue here are usually proba- 
bilistic statements about the distribution of specific structures, such as the classic 
Greenbergian generalizations about word order and morphological markedness 
patterns. Some examples are given in (1)-(4) below: 


(1) With overwhelmingly greater than chance frequency, languages with 
normal SOV order are postpositional. (Greenberg 1963: 62) 


(2) A language never has more gender categories in nonsingular numbers
than in the singular. (Greenberg 1963: 75) 


(3) If a language uses an overt inflection for the singular, then it also uses an
overt inflection for the plural. (Croft 2003: 89, based on Greenberg 1966: 
28) 


(4) In their historical evolution, languages are more likely to maintain and
develop non-ergative case-marking systems (treating S and A alike) than 
ergative case-marking systems (splitting S and A). (Bickel et al. 2015: 5) 


As can be seen from these examples, cross-linguistic generalizations of this 
kind may be formulated in terms of preferred types in synchronic samples or in 
terms of higher transitional probabilities for these types in diachronic change 
(see also Greenberg 1978; Maslova 2000; Cysouw 2011; Bickel 2013 for discussion 
of the latter approach). But this is, strictly speaking, independent of the ques- 
tion we are primarily concerned with here, namely how to best account for such 
generalizations once they have been established. 
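The link between the two formulations can be made concrete with a small sketch. It is an illustration added here, not an analysis from the volume: it assumes a simple two-state Markov chain in which languages switch between a "preferred" and a "non-preferred" type, and the function names and probabilities are hypothetical. Under these assumptions, a higher transitional probability towards the preferred type yields a skewed synchronic distribution in the long run, whatever the starting point.

```python
def stationary_share(p_gain: float, p_loss: float) -> float:
    """Long-run share of languages of the preferred type in a two-state chain.

    p_gain: probability per time step that a language of the other type
            changes into the preferred type (hypothetical value).
    p_loss: probability per time step that a language of the preferred type
            changes away from it (hypothetical value).
    """
    if not (0.0 <= p_gain <= 1.0 and 0.0 <= p_loss <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if p_gain + p_loss == 0.0:
        raise ValueError("no change at all: the long-run share is undefined")
    return p_gain / (p_gain + p_loss)


def simulate_share(p_gain: float, p_loss: float,
                   start_share: float, steps: int) -> float:
    """Iterate the expected share of the preferred type over `steps` time steps."""
    share = start_share
    for _ in range(steps):
        # Languages that keep the preferred type plus languages that acquire it.
        share = share * (1.0 - p_loss) + (1.0 - share) * p_gain
    return share


if __name__ == "__main__":
    # Assumed rates: gaining the preferred type is three times as likely as losing it.
    print(stationary_share(0.03, 0.01))
    print(simulate_share(0.03, 0.01, start_share=0.1, steps=1000))
```

The sketch says nothing about why the transition probabilities differ, which is precisely the explanatory question pursued in this volume.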
The most widespread typological approach to explanation is grounded in functional
properties of the preferred structural types: For example, typical correlations
in the ordering of different types of phrases (e.g. object-verb and NP-postposition)
have been argued to allow efficient online processing (e.g. Hawkins
1994; 2004). Markedness patterns in morphology (e.g. the distribution of zero ex- 
pression in case, number or person systems) have been attributed to economy, i.e. 
the desire to leave the most frequent and hence most predictable constellations 
unexpressed, or rather to a competition between economy and the motivation 
to code all semantic distinctions explicitly (e.g. Haiman 1983; Comrie 1989; Ais- 
sen 2003; Croft 2003; Haspelmath 2008, among many others). The general idea 
behind this approach is thus that speech communities around the world are sub- 
ject to the same kinds of cognitive and communicative pressures, and that the 
languages they speak tend to develop structures that respond to these pressures 
accordingly, or, as Bickel (2014: 118) puts it, “in such a way as to fit into the
natural and social eco-system of speakers: that they are easy to process, that they
map easily to patterns in nonlinguistic cognition, and that they match the social
and communicative needs of speakers.”

There is a clear parallel to evolutionary biology here, in that languages are said 
to converge on similar structural solutions under the same functional pressures, 
just like unrelated species tend to develop similar morphological shapes in order 
to be optimally adapted to the specific environment they co-inhabit (Deacon 1997; 
Caldwell 2008; Evans & Levinson 2009; Givón 2010). When applied to language, 
this line of explanation at least implicitly invokes what is known as “attractor
states”,¹ i.e. patterns of structural organization that languages are drawn into in
their course of development. For this reason, one could also speak of a RESULT- 
ORIENTED approach to explanation. 

There is, however, another way of looking at the same patterns, one that redi- 
rects attention from the functional properties to the diachronic origins of the 
linguistic structures in question. On this view, many universal tendencies of or- 
der and coding are seen as by-products, as it were, of recurrent processes of 
morphosyntactic change, notably grammaticalization, but without being adaptive
in the above sense: There is no principled convergence on similar structural
traits because these traits might be beneficial from the perspective of
processing, iconicity or economical communicative behaviour. Instead, the current
synchronic distributions are argued to be long-term reflections of individual
diachronic trajectories, in particular the diachronic sources from which the
structures in question originate. Givón (1984) and Aristar (1991), for example,
suggested that certain word-order correlations may simply be a consequence of a
given ordering pair (e.g. Gen-N & Rel-N, or V-O & Aux-V) being directly related
diachronically: Auxiliaries normally grammaticalize from main verbs that
take other verbs as complements, and since these complements follow the verb
in VO languages, they also follow the auxiliary in the resulting Aux-V
construction; the mirror-image pattern holds for OV languages (see also Lehmann
1986: 12-13). If this line of reasoning extends to most other word-order pairs, there
is no need to motivate the synchronic correlations in functional-adaptive terms,
e.g. by saying that the correlations arise in order to facilitate efficient sentence
processing.

¹ The term attractor state (or basin of attraction) is adopted from the theory of complex dynamic
systems (e.g. Cooper 1999; Howe & Lewis 2005; Holland 2006), which has become increasingly
popular as a way of viewing linguistic systems as well (see Beckner et al. 2009 and Port 2009
for general overviews, and Haig 2018 or Nichols 2018 for very recent applications to typological
data).

In the domain of morphology, Garrett (1990) argued that patterns in case mark- 
ing, specifically of differential ergative marking, are exhaustively explained by 
the properties of the source of the ergative marker: When ergative case arises 
from the reanalysis of instrumental case, the original characteristics of the lat- 
ter, such as a restriction to inanimate referents, are directly bequeathed to the 
former. The result is a pattern in which animate A-arguments are left unmarked, 
but since this is a direct "persistence effect" (Hopper 1991) of the history of the 
ergative marker, there is again no need for an additional functional-adaptive ex- 
planation in terms of other principles, such as a drive for economical coding pat- 
terns. Rather than being result-oriented, then, this way of explaining universals 
can be characterized as SOURCE-ORIENTED. 

Such source-oriented explanations thus move away from attractor states of 
grammatical organization and often emphasize the importance of “attractor tra- 
jectories" instead (Bybee & Beckner 2015: 185): In some domains of grammar, 
the patterns of reanalysis and ensuing grammaticalization are so strikingly sim- 
ilar across the world's languages that it is not surprising that they yield similar 
outcomes, such as strong correlations between V-O & Aux-V or V-O & P-NP 
ordering. In other cases, it is argued that many individual, and partly very differ- 
ent, diachronies are capable of producing a uniform result, but without any con- 
sistent functional force driving these trajectories. Cristofaro (2017), for instance, 
claims that this is the case for plural markers: An initial system without number 
marking can develop an overt plural morpheme from many different sources - 
usually by contextual reanalysis - and thus ultimately come to contrast a zero 
singular with an overt plural, but these developments are neither triggered nor 
further orchestrated by a need for economical coding: They do not happen to 
keep the (generally more frequent) singular unmarked and the (generally less
frequent) plural overtly signalled. 

In other words, whether the individual diachronic trajectories are highly sim- 
ilar or rather diverse, the premise of the source-oriented approach is that they 
can scale up to produce a predominant structural pattern in synchronic samples. 
Hence they obviate the need for highly general functional principles tying these 
patterns together. 

While the source-oriented approach was still a more marginal position in 
previous volumes on explaining language universals (e.g. Hawkins 1988a; Good 
2008), it has gained considerable ground over the last decade, notably in a series 
of articles by Cristofaro (e.g. Cristofaro 2012; 2014; 2017) but also in other pub- 
lications (e.g. Anderson 2016; Creissels 2008; Gildea & Zúñiga 2016). Moreover, 
while the basic thrust of the two explanatory approaches is straightforward,
clarification is needed on a number of - equally fundamental - details. After all, both
approaches are functionalist in nature, as they rely on domain-general mecha- 
nisms (Bybee 2010) to explain the emergence of language structure and linguistic 
universals; and in both approaches, these mechanisms constrain how languages
"evolve into the variation states to which implicational and distributional
universals refer" (Hawkins 1988b: 18). But as Plank (2007: 51) notes, “what is supposed
to be the essence and force of diachronic constraints would merit livelier
discussion.” It is the goal of the present book to offer precisely a discussion of this
kind.

The volume begins with a programmatic paper by Martin Haspelmath on 
what it means to explain a universal in diachronic terms. He aims to clarify how 
diachrony is involved in result-oriented and source-oriented accounts, respec- 
tively, and thus lays out a general conceptual framework for the explanation 
of universals. At the same time, Haspelmath opens the floor for debating the 
strengths and weaknesses of the two explanatory accounts at issue here. His 
own position is that, in many cases, current source-oriented explanations are ill- 
equipped to truly explain the phenomena they intend to account for, and hence 
cannot replace result-oriented motivations. Haspelmath's arguments for this po- 
sition, as well as his terminological proposals, provide a frame of reference to 
which all other contributions respond in one way or another. 

The lead article is followed by two endorsements of source-oriented explana- 
tions, articulated by Sonia Cristofaro and Jeremy Collins, respectively. They 
both describe the approach in widely accessible terms, allowing also readers out- 
side of linguistic typology to appreciate the general argument as well as the 
specific examples discussed. The phenomena themselves involve domains that 
are particularly well-known for being explained in functional-adaptive terms,
namely differential argument marking, number marking and word-order corre- 
lations, and these are all argued to be best captured by persistence effects from 
their respective diachronic origins. 

We then proceed to papers that allow for progressively more room for func- 
tional-adaptive motivations and, importantly, for methodological discussions on 
how to obtain evidence for such pressures. Accordingly, all of these papers ad- 
duce novel empirical data and discuss them in light of the present debate. 

Matthew Dryer's paper is an immediate follow-up on Collins's discussion of 
word-order correlations. On the one hand, Dryer argues that the various corre- 
lates of adposition-noun ordering (e.g. O-V and NP-P, and Gen-N and NP-P) 
are, indeed, best accounted for in source-oriented terms. In particular, only this 
approach proves capable of explaining the occurrence (and the individual seman- 
tic types) of both prepositions and postpositions in SVO languages. On the other 
hand, however, Dryer contends that there are some significant correlations for 
which a source-based account either fails to offer an explanation or else makes 
the opposite prediction of the patterns we find synchronically. Dryer concludes, 
therefore, that neither a purely source-based nor a purely result-based explana- 
tion is sufficient to deal with word-order correlations. 

In a similar fashion to Dryer's paper, Holger Diessel's article demonstrates 
that different aspects of the same grammatical domain - in this case adverbial
clause combinations - are amenable to different types of explanation. Diessel
focuses specifically on the structure and development of preposed adverbial clauses
and argues that some of their typological characteristics, notably the properties 
of their subordinating morphemes, receive a satisfactory explanation in terms 
of the respective source construction(s), thereby supplanting earlier processing- 
based explanations. On the other hand, he proposes that the position of adverbial 
constructions (in general) is clearly subject to a number of functional-adaptive 
pressures, and that these may already have affected the diachronic sources from 
which the current preposed adverbial clauses have grammaticalized. 

Karsten Schmidtke-Bode offers a review of John Hawkins's (2004; 2014)
research programme of “processing typology”, examining the plausibility of
Hawkins's functional-adaptive ideas in diachronic perspective. On a theoretical level,
it is argued that a predilection for efficient information processing is operative 
mostly at the diffusion stage of language change, regardless of the source from 
which the respective constructions originate. On a methodological level, the pa- 
per proposes that the cross-linguistic predictions of Hawkins's programme can 
be tested more rigorously than hitherto by combining static and dynamic
statistical models of large typological data sets; this is demonstrated in a case study
on the distribution of article morphemes in VO- and OV-languages, respectively. 

An important methodological point is also made by Ilja A. Seržant, who claims
that certain functional-adaptive pressures may not actually surface in standard 
typological analysis because they are weak forces, clearly at work but also eas- 
ily overridden by other, language-specific factors. Because of their weak nature, 
they may not be directly visible anymore in a synchronic type, but they can 
be detected in qualitative data from transition phases. Based on diachronic data 
from Russian, Seržant shows how the development of differential object marking
was crucially influenced by considerations of ambiguity avoidance (and hence a 
classic functional-adaptive motivation), over and above the constraints inherited 
from the source construction. In the absence of such longitudinal data, transition 
phases can be identified on the basis of synchronic variability, and Seržant shows
that a wide variety of languages currently exhibit variation in differential object 
marking that mirrors the diachronic findings from Russian, and that is not pre- 
dictable from the source meaning of the marker in question. 

Susanne Maria Michaelis adds another source of data to the debate at hand. 
She argues that creole languages provide a unique window onto the relation- 
ship between synchronic grammatical patterns and their diachronic trajectories, 
as the latter are often relatively recent and also accelerated when compared to 
normal rates of grammatical change. The developments are, consequently, more 
directly accessible and less opaque than in many other cases. By inspecting creole 
data on possessive forms in attributive and referential function (e.g. your versus 
yours), Michaelis finds evidence for the development of the same kinds of coding 
asymmetries that this domain shows in non-contact languages around the world. 
She proposes that the data are indicative of result-oriented forces that drive di- 
verse diachronic pathways towards the same synchronic outcome. This stance 
contrasts most explicitly with Cristofaro's, who interprets such situations in ex- 
actly the opposite way (i.e. as providing evidence against a unifying functional 
explanation). 

Natalia Levshina, finally, adopts an entirely different methodological approach 
to illuminate the present discussion: In her paper, she showcases the paradigm of 
artificial language learning, which can be employed to inspect whether users of 
such newly acquired languages develop performance biases that are in keeping 
with hypothesized functional principles, such as an increasingly efficient distri- 
bution of morphological marking. Her case study clearly demonstrates such bi- 
ases and discusses where they may ultimately come from, i.e. how they fit into 
the new conceptual framework of constraints offered by Haspelmath's position
paper.

The volume is rounded off by a brief epilogue in which Karsten Schmidtke- 
Bode and Eitan Grossman summarize and further contextualize the arguments 
put forward by the contributors. 

Overall, the purpose of the present book is to provide a state-of-the-art over- 
view of the general tension between source- and result-oriented explanations in 
linguistic typology, and specifically of the kinds of arguments and data sources 
that are (or can be) brought to bear on the issue. It should be made clear from 
the outset that the two types of explanation are framed as antagonistic here even 
though in most cases, an element of both will be needed in order to fully account 
for a given grammatical domain. As we emphasize in the epilogue, the diachronic 
source of a grammatical construction certainly constrains its further develop- 
ment, but the major issue at stake here is the extent to which result-oriented, 
functional-adaptive motivations enter these developments as well. At the end of 
the day, universals of language structure will thus differ in the degree to which 
they are shaped by such adaptive pressures. 
Chapter 1


Can cross-linguistic regularities be 
explained by constraints on change? 


Martin Haspelmath 
MPI-SHH Jena & Leipzig University 


This paper addresses a recent trend in the study of language variation and univer- 
sals, namely to attribute cross-linguistic patterns to diachrony, rather than to other 
causal factors. This is an interesting suggestion, and I try to make the basic con- 
cepts clearer, by distinguishing clearly between language-particular regularities, 
universal tendencies, and mere recurrent patterns, as well as three kinds of causal 
factors (preferences, constraints, restrictions). I make four claims: (i) Explanations 
may involve diachrony in different ways; (ii) for causal explanations of universal 
tendencies, one needs to invoke mutational constraints (change constraints); (iii) 
in addition to mutational constraints, we need functional-adaptive constraints as 
well, as is clear from cases of multi-convergence; and (iv) successful functional- 
adaptive explanations do not depend on understanding the precise pathways of 
change. 


1 Language universals: Constraints on cross-linguistic 
distributions as explananda 


Since Greenberg (1963), it has been widely recognized that comparison of lan- 
guages with world-wide scope can give us not only taxonomies (as in earlier 
typology, e.g. von Schlegel 1808; Schleicher 1850: 5-10; Sapir 1921), but intrigu- 
ing limits on cross-linguistic distributions: Especially when one looks at several 
parameters simultaneously, not all logically possible types are attested, or some 
types are far more common and others far less common than would be expected 
by chance. We would like to know why - or in other words, we are looking for 
causal explanations. 
Since at least Chomsky (1981), many generative grammarians have also been
interested in cross-linguistic regularities, and have often interpreted them as fol- 
lowing from innate principles of Universal Grammar (UG) and their paramet- 
ric variation. Others have tended to prefer functional explanations of universals 
(e.g. Comrie 1989; Stassen 1985; Dixon 1994; Dik 1997; Hawkins 2014), but these 
authors have likewise appealed primarily to general principles of language and 
sometimes have even adopted the term "universal grammar" (Keenan & Comrie 
1977; Foley & Van Valin 1984; Stassen 1985). 

In contrast to these two dominant approaches of the 1970s-1990s, there is an 
alternative view, according to which the explanation for universals of language 
structure comes from diachrony. The first well-known author in this tradition 
is Greenberg (1969), who stated that "[s]ynchronic regularities are merely the 
consequence of [diachronic] forces" (1969: 186). A straightforward example of 
the explanatory role of diachrony is the generalization that in languages with 
prepositions, the possessor generally follows the possessed noun in the possessive
construction, while in languages with postpositions, it generally precedes it
(Greenberg's (1963) Universal 2; Dryer 1992). This can be explained on the basis of the
diachronic regularity that new adpositions generally arise from possessed nouns 
in processes of grammaticalization (Lehmann 2015[1982]: §3.4.1; Bybee 1988:
353-354; Collins 2019; Dryer 2019 [both in this volume]). For example, English because
(of) comes from by + cause (of). Since the order of the elements remains stable in
grammaticalization, we have an explanation for the fact that the possessed noun 
and the adposition tend to occur in the same position in languages. 

The view that the explanation of language universals comes (at least some- 
times) from diachrony has apparently been gaining ground over the last decade 
and a half. The early papers by Greenberg (1969; 1978) and Bybee (1988) repre- 
sented minority views (though Givón 1979 and Lehmann 2015[1982] discussed 
diachronic change extensively and contributed to giving it a prominent place in 
functional-typological linguistics). Prominent papers in this vein in more recent 
years are Aristar (1991), Anderson (2005; 2008; 2016), Cristofaro (2012; 2013; 2014), 
Creissels (2008), Gildea & Zúñiga (2016), and in phonology, Blevins (2004) is a 
book-length study that adopts a similar approach (see also Blevins 2006). The 
following are a few key quotations from some of these papers (and from some 
others): 


(1) a. “The question for typology is perhaps not what kinds of system are
possible, but what kinds of change are possible” (Timberlake 2003:
195)




b. “recurrent synchronic sound patterns are a direct reflection of their 
diachronic origins, and, more specifically, ... regular phonetically 
based sound change is the common source of recurrent sound 
patterns" (Blevins 2006: 119-120) 


c. "statistical universals are not really synchronic in nature, but are
rather the result of underlying diachronic mechanisms that cause
languages to change in preferred or “natural” ways" (Bickel et al. 2015:
29)

d. "there are no (or at least very few) substantive universals of language, 
and the regularities arise from common paths of diachronic change 
having their basis in factors outside of the defining properties of the 
set of cognitively accessible grammars" (Anderson 2016: 11) 


This paper has two major goals: First, I would like to contribute to conceptual 
clarification, sorting out what kinds of claims have been made and what terms 
have been used for which kinds of phenomena (§2).

Second, I argue that there are two ways in which diachrony and universals may 
interact: Some cross-linguistic generalizations are due to change constraints, as 
envisaged by the authors in (1), but others are due to functional-adaptive con- 
straints. More specifically, I want to make four points: 


• The notion of “diachronic explanation” is too vague, because explanations
may involve diachrony in rather different ways (§3).

• Universal tendencies cannot be explained by common pathways of change,
only by change constraints, or what I call mutational constraints (§4).

• Multi-convergence clearly shows that functional-adaptive constraints are
needed in order to explain at least some cross-linguistic regularities (§5).

• Functional-adaptive explanations do not depend on understanding the
pathways of change, though knowing about the pathways illuminates the
explanations (§6).


Before arguing for these four points, I will discuss some technical terms in
the next section, because there is often confusion between terms for
language-particular regularities (§2.1), cross-linguistic regularities (§2.2), and
causal factors (§2.3).
2 Regularities and causal factors: Concepts and technical
terms 


General terms such as restriction, constraint, preference, tendency, bias, and moti- 
vation have been used in diverse and sometimes confusing ways by linguists. This 
section clarifies how these terms are used in the present paper, noting along the 
way what other meanings some of them have been given and what other terms 
have been used for (roughly) the same concepts. I distinguish between terms for 
regularities and terms for causal factors, and within the terms for regularities, I 
distinguish between language-particular and cross-linguistic regularities. 


2.1 Language-particular regularities 


Regularities within a particular language can concern language use or the con- 
ventional language system. Regularities of language use are increasingly studied 
by corpus linguistics, and they are often thought to be at the root of system regu- 
larities, especially in what is often called a “usage-based” view (Bybee 2010). How- 
ever, regularities of use and system regularities are conceptually different, and 
linguists normally distinguish clearly between parole (language use) and langue 
(language system). In what follows, I focus on the systems of linguistic conven- 
tions. 

For regularities within language systems, linguists normally use the general 
terms rule and construction (or schema). In addition, descriptive linguists use 
many other well-established class (or category) terms like clause, noun phrase, 
suffix, dative case, or terms for relations between constructions such as alterna- 
tion or derivation. All of these relate to systems of particular languages. 

The term constraint is sometimes applied to language-particular regularities, 
e.g. in constraint-based formalisms such as HPSG, and optimality theory also 
uses constraints for language-particular regularities. However, I will use this 
term exclusively for causal factors, as explained in §2.3 below.

Language-particular regularities can also be seen as "explanations", at least in 
the weak sense that they answer why-questions about lower-level regularities 
(“Why is there a Dative case on the object of this sentence? Because the verb's 
valency requires a Dative”). Statements of rules or constructions may thus be 
called “descriptive explanations" if one wishes. In this paper, however, I focus on 
causal explanations that help us explain the conventional systems of languages 
themselves. 
2.2 Cross-linguistic regularities: Recurrent patterns and universal
tendencies 


Cross-linguistic regularities are typically generalizations over language-particular
regularities,¹ and I will distinguish two kinds of regularities here. On the
one hand, similar phenomena may be found in different parts of the world, e.g. 
ejective consonants, or high vowel epenthesis, or optative mood forms, or func- 
tive markers (Creissels 2014). These are called RECURRENT PATTERNS. On the other 
hand, some regularities are so strong that we call them UNIVERSALS, because they 
occur with much greater than chance frequency. I also often use the term
UNIVERSAL TENDENCIES, because there is no claim that there are no exceptions.²

Recurrent patterns are not accidental similarities, in the sense that there must 
be something in the human condition that makes it possible for very similar 
linguistic categories to appear independently in languages that have no historical 
connection. However, the discovery of a recurrent pattern does not imply a claim 
about further languages. 

By contrast, the discovery of a universal implies a claim about all other lan- 
guages: If a universal holds (i.e. is found with much greater than chance fre- 
quency in a reasonably representative sample), it is claimed that it also holds 
in any other representative sample. Thus, claims of universal tendencies can be 
tested by examining data from the world's languages, while claims of recurrent 
patterns can only be strengthened by further observations, but neither
confirmed nor disproven by additional data. 
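What "much greater than chance frequency" could mean for a single sample can be sketched as follows. This is an illustration under stated assumptions, not a method proposed in this paper: the counts are invented, the function name is hypothetical, and the test (a one-sided Fisher exact test on a 2×2 table) treats the languages as independent data points, which real typological samples are not, since genealogical and areal relatedness have to be controlled for.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """Probability of observing at least `a` in the top-left cell by chance.

    Layout of the (hypothetical) table:
                 postpositions   prepositions
        OV             a               b
        VO             c               d
    The row and column totals are held fixed (hypergeometric distribution).
    """
    row1 = a + b            # number of OV languages
    col1 = a + c            # number of postpositional languages
    n = a + b + c + d       # sample size
    total = comb(n, col1)   # ways of choosing the postpositional languages
    # Sum over all tables that are at least as skewed as the observed one.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1))
    return tail / total


if __name__ == "__main__":
    # Invented counts for a sample of 100 languages.
    print(fisher_one_sided(40, 5, 6, 49))    # strongly skewed sample
    print(fisher_one_sided(25, 25, 25, 25))  # no association at all
```

A small probability for the skewed table is what a claimed universal tendency predicts for any further representative sample; a recurrent pattern makes no such prediction.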

Universal tendencies need to be distinguished, in particular, from family-spe- 
cific or region-specific trends, so they need to be based on a world-wide sample. 
A well-known example is the finding that in all major world regions, languages 
with OV order tend to have postpositions, and languages with VO order tend 
to have prepositions (Greenberg 1963: Universal 2; Dryer 1992: 83), even though 
many languages are exceptions. Another universal tendency is the limitation of 
nominal suppletion to the most frequent nouns (Vafaeian 2013), even though 
many languages do not have nominal suppletion at all. We may even identify 
universal tendencies within patterns that are quite rare, e.g. universals of infixa- 
tion (Yu 2007), because universal tendencies can be implicational (“If a language 
has infixation, then...”). 


¹ However, comparative corpus linguistics studies comparable corpora of language use, so there
is no necessary connection between cross-linguistic comparison and the study of systems (as 
opposed to use). 

² Another term for a cross-linguistic distribution is Bickel's (2013) family bias, which means
‘preponderance within a family’. Note that this use of bias is quite different from the more 
common use as in cognitive bias (e.g. Tversky & Kahneman 1974); a term like family tendency 
would probably be more transparent. 
Recurrent patterns, by contrast, are not associated with any kind of global
claim, so they could be called frequent patterns, or sporadic patterns, depending 
on one's subjective assessment of their frequency. They are no doubt important 
for a complete account of human language, but they will be left aside in what 
follows, as it is not clear what causal factors might illuminate them. 


2.3 Causal factors: Preferences, constraints, restrictions 


In addition to documenting language-particular systems and cross-linguistic dis- 
tributions, we also want to know what might explain the distributions in causal 
terms. The explanatory devices are called causal factors, or (system-external) mo- 
tivations, or constraints. Especially the latter term is short and relatively clear, so 
I will use it as the default term for a causal factor. (Two other terms that are used 
commonly as well, especially outside core linguistics, are force and pressure. It 
seems that all these terms are basically synonymous.) 

If a constraint is very strong, it can also be called restriction, and if it is weaker, 
it can be called preference.³ This seems to be in line with much current usage in
linguistics. There is thus no objective difference between restrictions, constraints 
and preferences, and we could use one of the three terms for all types of con- 
straints. (This situation is similar to the cases of sporadic and frequent patterns, 
which are subjective sub-cases of recurrent patterns.) 

Depending on the way in which they affect cross-linguistic distributions, here 
I distinguish four types of constraints (or restrictions, or preferences), which can 
be briefly characterized as in (2). 


(2) a. functional-adaptive constraints: what facilitates communication
       (including processing) for speakers and hearers

    b. representational constraints: what is innately preferred or necessary in
       the cognitive representation of language

    c. mutational constraints: what is preferred or necessary in language
       change (= change constraints)

    d. acquisitional constraints: what is preferred or necessary in acquisition
       by children


³ Another term for system-external causal factors is bias, which is used in particular by
psychologists for cognitive preferences. Typical biases seem to be quite weak, so that even detecting
them is an important part of research. By contrast, linguists' constraints are often very strong, 
and controversy concerns primarily the nature (functional-adaptive, representational, muta- 
tional) and the interaction of the constraints. 
FUNCTIONAL-ADAPTIVE CONSTRAINTS are the kinds of factors that have been invoked by
functionalists to explain cross-linguistic distributions (e.g. Tomlin 1986; Malchukov 2008;
Hawkins 2014; among many others). For example, phonological inventories favour five-vowel
systems because these make the best use of the acoustic space (De Boer 2001), and case
systems favour overt ergatives for low-prominence nominals and overt accusatives for
high-prominence nominals because of the association between roles and prominence status
(Dixon 1994). These constraints are called functional-adaptive rather than merely functional
to emphasize their role in explaining systems, not usage (the functioning of language).
Functional linguists often focus on understanding the functioning of language in usage, but
here my interest is in explaining how systems come to have properties that facilitate
communication.⁴ Good (2008) uses the term “external explanation” in roughly this sense (cf.
also Newmeyer 1998: §3.4), but all four types of constraints are external in that they are not
part of the system. (“System-internal explanation” is just another word for general
regularities of language-particular systems, cf. §2.1 above; I do not think that the notion of
causality is relevant for such statements, so all causal explanatory factors are external.)

REPRESENTATIONAL CONSTRAINTS are the kinds of factors that have been invoked by
generativists to explain grammatical universals, as noted in §1. In the Principles and
Parameters framework (Chomsky 1981), they were called the principles of Universal Grammar.
For example, the principles of X-bar theory or binding theory have been regarded as
representational constraints, as well as universal features and hierarchies of functional
categories such as determiner (e.g. Cinque 1999). The general idea is that “the unattested
patterns do not arise as they cannot be generated in a manner consistent with Universal
Grammar” (Smith et al. 2018). Representational constraints are usually regarded as very
strong, i.e. as restrictions (and thus Universal Grammar is said to be restrictive; cf. also
Haspelmath 2014).⁵ However, there is no intrinsic reason why representational constraints
could not be weaker preferences, e.g. why there could not be a weak innate preference to put
elements into a “determiner” category (though this possibility is almost never considered by
linguists). In Good's (2008) survey, representational constraints are treated under the label
of “structural explanations”, but this term (like “system-internal explanations”) is better
reserved for general statements of regularities of language-particular systems.

⁴ Another term for functional-adaptive constraint is “naturalness parameter” (Dressler et al.
1987), and functional-adaptive changes have been called “natural changes”.

⁵ Cognitive linguists have also sometimes invoked representational constraints to explain
universals, though these are not referred to as UG. An example might be the idea in Croft
(1991) that all event types are modeled on the basic force-dynamic agent-patient event type.
This is not very strong, i.e. it is a preference, but apparently a preference having to do
with cognitive representations, not with communicative or processing preferences.

MUTATIONAL CONSTRAINTS (or change constraints) are constraints on possi- 
ble diachronic transitions or possible diachronic sources, which can have an ef- 
fect on synchronic distributions. For example, if nasal vowels only ever arise 
from VN sequences, this explains that all languages with nasal vowels also have 
nasal consonants, and that nasal vowels are rarer than oral vowels in the lex- 
icon (Greenberg 1978). Likewise, if infixes only ever arise by metathesis from 
adfixes (= prefixes or suffixes), this explains that they only occur in peripheral 
position (Plank 2007: 51). And if adpositions only arise from nouns in possessor- 
noun constructions, this explains that their position correlates with the position 
of possessed nouns in the possessive construction, as noted in §1. The notion of mutational
constraints is not new (Plank 2007: 82 calls them “diachronic laws”), but I introduce a new
term here in order to make clear that the causal factor is located within the process of
change, rather than diachronic change merely realizing a pattern that is driven by
functional-adaptive constraints (see §3 below).
One could also frame the contrast between mutational constraints and functional- 
adaptive constraints in terms of source-oriented vs. result-oriented factors (Cristo- 
faro 2017), or one could say that mutational constraints locate the causal factors 
within the mechanisms of change (Bybee 2006). These are just alternative ways 
of saying that cross-linguistic distributions are due to mutational constraints. 

Finally, ACQUISITIONAL CONSTRAINTS are factors that impact the acquisition 
of language and that have an effect on cross-linguistic distributions. Such con- 
straints are briefly discussed by Anderson (2016), but they do not seem to play a 
big role in linguistics (but cf. Levshina 2019 [this volume] for discussion). Gen- 
erative linguists who are concerned with learnability issues generally assume 
that what can be represented can also be learned, so that there is no distinction 
between representational constraints and what can be learned. This type of con- 
straint is mentioned here only in passing, for the sake of completeness. It will 
play no role in what follows. 


3 Two ways in which causal explanations involve 
diachrony 


The peculiar term mutational constraint that I adopt here may raise questions: Is 
it necessary to use a new term for something that is very straightforward? 


⁶ Informally, instead of talking about “result-oriented factors”, one could also say that
functional-adaptive constraints are “pull forces” that attract the variable development into a
certain preferred state.
The reason I am using this term here is that the possible alternatives “diachronic
constraint” or “diachronic explanation” are not fully clear. First of all, diachronic
explanations may simply be explanations of diachronic changes, but here we are concerned with
causal factors leading to universals. Second, “diachronic” and “historical” are often used
interchangeably (cf. Good's (2008) term “historical explanation” for what I call mutational
explanations), and when we
speak about historical explanations, we often mean contemporary idiosyncrasies 
that are better understood if one knows their origins (e.g. the vowel alternation in 
foot/feet finds a historical explanation in the earlier productive pattern of vowel 
fronting conditioned by a high vowel in the following syllable). But all of this 
is irrelevant in the present context, where we are concerned with possible and 
impossible pathways (and sources) of change. 

Most importantly, the term mutational constraint is necessary because there 
are two ways in which causal explanations involve diachrony: synchronic dis- 
tributions may be diachronically DETERMINED, or they may come about by the 
diachronic REALIZATION of preferred outcomes. The term mutational constraint 
highlights the fact that change is seen as a causal factor here, not merely the 
way in which the cross-linguistic distributions arise. By contrast, when universal 
tendencies are explained by functional-adaptive constraints, diachronic change 
merely serves to realize the adaptation. It plays an important role, indeed a cru- 
cial role, because functional adaptation is impossible without change. In this 
sense, functional-adaptive explanations are also diachronic (cf. Haspelmath 1999a).
But functional-adaptive change is not the cause of the adaptation - the cause is 
the facilitation of communication for speaker and hearer. Mutational constraints 
are situations where the causal factor resides in the change itself. 

Two types of mutational constraints may be distinguished: Source constraints 
and directionality constraints. Most of the diachronic regularities discussed by 
Cristofaro (2017) concern constraints on possible sources. The best-known di- 
rectionality constraint is the irreversibility of grammaticalization (Haspelmath 
1999b, 2004). 

Another reason for avoiding the terms “diachronic constraint” or “diachronic explanation” is
that they invite a contrast with “synchronic constraint” and “synchronic explanation”. But
these terms are themselves very problematic, because they seem to conceive of explanation in
noncausal terms. The term “synchrony” has a clear application with reference to an abstract,
idealized language system (de Saussure's langue), but in §2.1 I noted that
language-particular system regularities should be described in terms of constructions or
rules, and that causal constraints cannot play any role in them.⁸

⁷ Mutational constraints are themselves in need of explanation, of course. I say nothing about
this in the current paper, because it is already long and complicated enough. Their
explanation could itself be “functional” in some sense (to be made more precise), but it
cannot be functional-adaptive, because the latter type of explanation (as I understand it
here) by definition applies only to language systems, not to changes.

Instead of “mutational constraint”, one could use “constraint on change” (as in the title of
this paper), but the new term “mutational” is more salient (it can be found more easily in
automatic text searches), and since it is more specific, it can be used in new combinations
like “mutational explanation” (an explanation in terms of a mutational constraint) or
“mutational approach”.


4 Universals are not explained by recurrent pathways of 
change, only by constraints on change 


It has long been known that there are recurrent kinds of changes in phonology 
(lenition of consonants between vowels, diphthongization of long vowels, assim- 
ilation, etc.), and over the last few decades, recurrent changes in morphosyntax 
have become prominent as well, especially changes falling under the broad cat- 
egory of grammaticalization (Lehmann 2015[1982]; Heine et al. 1991; Bybee et al. 
1994; and much related work). 

Bybee (2006) highlights recurrent or common pathways of change in the tense- 
aspect domain (perfectives coming from anteriors and ultimately from comple- 
tive, resultative or movement constructions; imperfectives coming from progres- 
sives and ultimately from locational or reduplicative constructions; and futures 
coming from volitional or movement constructions), and makes the claim that 
“the true universals of language are the mechanisms of change that propel the
constant creation and recreation of grammar” (Bybee 2006: 179-180).

But she does not distinguish clearly between recurrent pathways of change 
and constraints on possible changes. There is no doubt that the tense-aspect 
changes that she discusses are widespread and significant developments, but no- 
body knows how widespread they are, compared to other possible changes. There 
are many perfective, imperfective and future markers about whose sources we 
know little, or markers whose sources do not fit into any of Bybee’s categories. 
It is true that the recurrence of the changes makes it virtually certain that the
similarities are not accidental, but we do not know enough about tense-aspect developments to
assert with confidence that no other sources are possible or likely, nor even that these
sources are clearly predominant over other possibilities.

⁸ Of course, in practice linguists often use the terms “synchronic explanation” and
“synchronic constraint”, but what they mean is either (i) very general language-particular
statements (“descriptive explanations”, §2.1), or (ii) representational constraints. The
latter are biological limitations, which can hardly be labeled felicitously with the
Saussurean term synchronic.

In one passage Bybee asserts that "the diachronic paths present much stronger 
cross-linguistic patterns than any comparison based solely on synchronic gram- 
mars" (2006: 180; see also Bybee 2008: 169). But her evidence is not sufficient to 
show this, at least for tense and aspect, where the pathways of change are highly 
diverse, and few people would venture a claim that certain kinds of change are 
impossible or highly unlikely. 

In order to explain universal tendencies, one needs to appeal to something 
that is stronger than "recurrent (or common) pathways of change", namely muta- 
tional constraints, of the type mentioned earlier. Such constraints allow causal ex- 
planations of synchronic cross-linguistic distributions, just like functional-adap- 
tive constraints. In phonological change, also discussed by Bybee, some com- 
mon pathways may indeed qualify as mutational constraints: It could well be 
that changes involving [h] are highly uniform (especially [s]/[x] > [h] > Ø), so 
that we are dealing with a mutational constraint, not just a recurrent pathway.⁹
Since such mutational constraints entail certain synchronic distributions, they 
qualify as true explanations, and if a synchronic distribution can be explained by 
a change constraint, it is not "accidental" (as Collins 2019 [this volume] calls the 
universal that adposition order correlates with verb-object order).¹⁰

At this point it is reasonable to ask how one can distinguish in practice between recurrent
pathways and mutational constraints. The way to distinguish between synchronic
cross-linguistic regularities and recurrent patterns is by gathering representative
world-wide data samples (§2.2), and in principle, one would have to do the diachronic
counterpart in order to establish a mutational constraint. As Collins (2019 [this volume]: 54)
puts it, “we need large databases of attested grammaticalisation pathways”. This is not very
practical, however, as there are few solidly attested cases of grammaticalization, mostly
from European (and a few Asian) languages, and most of what we think we know about general
change patterns is based on indirect inferences and cannot be subjected to statistical
testing the way this is possible with synchronic patterns. Thus, in practice, linguists rely
on their general experience when making judgments, or they cite a range of examples to
persuade their colleagues. This method is much less rigorous than the study of synchronic
regularities, but it seems to be uncontroversial to assert that in general, both types of
diachronic regularities exist: mutational constraints (where a particular outcome has no other
possible source), and recurrent changes. This is all I want to argue for in this paper, and I
make no strong claims about particular instances (e.g. whether adpositions are constrained to
arise only from possessed nouns and transitive verbs, or whether these are merely recurrent
sources).

⁹ It is true, of course, that there are some really interesting constraints on
morphosyntactic change, notably the constraint that grammaticalization cannot be reversed
(Haspelmath 1999b). However, such mutational constraints need not give rise to synchronic
universal tendencies. Grammaticalization as such does not result in any universal tendencies,
and Bybee (2006: 88) is apparently right that the lenition of [s] or [x] via [h] to Ø does
not give rise to any synchronic universals either.

¹⁰ It could be that Collins thinks that only representational or functional-adaptive
constraints can explain synchronic universals, or it could be that he does not think that the
sources of adpositions are sufficiently constrained. See the next paragraph for more on that
possibility.


5 Multi-convergence can only be explained by 
functional-adaptive constraints 


Since mutational constraints are one possible source of synchronic universals, it 
could be that in fact all synchronic universals are due to mutational constraints 
of one kind or another, and that functional-adaptive and representational con- 
straints are not needed. This is a fairly radical position, but Cristofaro (2017) 
comes close to adopting it. 

Perhaps the strongest reason to believe that we also need functional-adaptive 
explanation is that there are many cases of multi-convergence, i.e. situations in 
which a uniform result comes about through diverse pathways of change that 
yield a very similar result. For example, I note in Haspelmath (2017) that inalien- 
able adpossessive constructions tend to have shorter coding or zero, whereas 
alienable adpossessive constructions have overt or longer coding, and I also ob- 
serve that these patterns can come about in different ways. The inalienable pat- 
tern may be shorter because of special shortening, or it may be shorter because 
only the alienable pattern got a special new marker. Kiparsky (2008: 37) makes 
a very similar argument against Garrett's (1990) explanation of split ergativity 
in mutational terms, noting that "[Garrett's] historical account is insufficiently 
general [...] because the phenomenon to be explained has several historical sour- 
ces". 

Interestingly, two of the advocates of mutational explanations of universal ten- 
dencies observe the heterogeneity of the pathways themselves. Anderson (2016) 
is concerned with case-marking patterns in perfective and imperfective aspects 
across languages, and Cristofaro (2017) is concerned with the coding asymmetry 
of zero singulars and overt plurals: 
As it happens, common sources for a new perfective, on the one hand, and
for a new imperfective, on the other, converge on similar patterns of split 
ergativity, although they are quite unrelated to each other. (Anderson 2016: 
23; cf. also Anderson 1977) 


Different instances of the same configuration can also be a result of very 
different processes. For example, phonological erosion, meaning transfer 
from a quantifier to an accompanying element, and the grammaticalization 
of distributives into plural markers can all give rise to a configuration with 
zero marking for singular and overt marking for plural, yet they do not 
obviously have anything in common. (Cristofaro 2017: 18-19) 


Anderson and Cristofaro are thus aware of the multi-convergence patterns, but 
for some reason they do not draw the conclusion that we need an additional 
causal factor to explain the convergence - and as far as I can see, this factor can 
only be a functional-adaptive constraint.¹¹

The convergence of diverse processes on a uniform result could conceivably 
be accidental, but in this case it could not explain a universal tendency, because 
a universal tendency is by definition non-accidental. A universal tendency still 
holds if more and more languages are looked at, whereas accidental similarities of 
the results of diverse processes would not be repeated if more phenomena were 
considered. On the analogy of biological usage, where "convergent evolution" 
refers to the independent development of similar traits for adaptive reasons, one 
should probably avoid the term "convergence" if one thinks that the similarities 
are accidental and will not be confirmed by a larger sample. Thus, Anderson and 
Cristofaro should think of their observations in terms of coincidental similarity 
rather than convergence.

¹¹ In principle, it could also be a representational constraint (i.e. Universal Grammar), but
since the patterns involve implicational universals, this would be difficult to argue for. In
general, implicational universals cannot be easily explained by representational constraints.


6 Functional-adaptive explanations need not specify
pathways of change


One point of criticism of functional-adaptive explanations is that they do not say how the
change comes about. Especially Bybee and Cristofaro have argued that for a functional
explanation of cross-linguistic regularities to be accepted, it must be shown how the
functional motivation plays a role in the way in which the resulting pattern comes about.

I agree that the functional motivation must play a role in the way in which the 
pattern comes about, but I do not agree that the manner in which it influences the 
change must be identified for a successful explanation. Below are two relevant 
quotations. 


[I]n language universals, causal factors are linguistic changes that create par- 
ticular synchronic states, and the existence of massive cross-language simi- 
larity in synchronic states implies powerful parallels in linguistic change. ... 
the validity of a principle as explanatory can only be maintained if it can 
be shown that the same principle that generalizes over the data also plays a 
role in the establishment of the conventions described by the generalization. 
(Bybee 1988: 352) 


These [functional] explanations ... have mainly been proposed based on the 
synchronic distribution of the relevant grammatical phenomena, not the ac- 
tual diachronic processes that give rise to this distribution in individual lan- 
guages. In what follows, it will be argued that many such processes do not 
provide evidence for the postulated dependencies between grammatical phe- 
nomena, and suggest alternative ways to look at implicational universals in 
general. (Cristofaro 2017: 10) 


The problem with Bybee's claim is that the changes are seen as causal factors 
themselves: Bybee does not seem to envisage the possibility of a “pull force” that
increases the probability of change toward a particular kind of outcome, without 
determining the way in which the change comes about. Moreover, she formu- 
lates the requirement that one should be able to demonstrate that the functional- 
adaptive principle plays a role in the change, but this requirement is too strong. 
In general, we do not know much about language change and how and why it 
happens. The primary evidence for functional-adaptive explanations is the fit 
between the causal factor and the observed outcome. If there is a good fit, e.g. 
if languages overwhelmingly prefer the kinds of word orders that allow easy 
parsing (Hawkins 2014), or if they tend to show economical coding of grammat- 
ical categories (Haspelmath 2008), the best explanation is in functional-adaptive 
terms, as long as there is a way for languages to acquire these properties. The 
latter requirement is always met, as there are no synchronic states which can- 
not have arisen from other states. Thus, we may not know how exactly the zero 
singulars and overt plurals in Hebrew (e.g. sus ‘horse’, sus-im ‘horses’) may have 
come about, as they are found in much the same way in Proto-Semitic, but we 
know various ways in which plurals can arise (Cristofaro 2013: §4), so there is
no problem in assuming that the functional motivation of economical coding of 
the singular played a role in the development of the contrast. 

Cristofaro is right that when we look at the changes that give rise to apparently 
functionally motivated distributions, we do not (necessarily) find evidence that 
the changes were driven by the need to obey the functional constraints, but find- 
ing such evidence is not necessary for a successful explanation. The evidence for 
the functional motivation does not come from the manner in which the change 
happened, but from the fit between the motivation and the observed outcomes. 
If there is a universal tendency, and it can be explained by a universal motivat- 
ing factor, then that explanation should be accepted unless a better explanation 
becomes available. 

Explanations of regularities in the world-wide distribution of cultural traits 
often appeal to functional-adaptive factors in adjacent fields as well. For exam- 
ple, anthropologists sometimes explain religion by prosociality, or monogamy 
by group-beneficial effects (e.g. Paciotti et al. 2011; Henrich et al. 2012). The issue 
here is whether better explanations are available, not whether there is a way for 
religion or marriage to develop. We know little about how religion and marriage 
first arose or generally arise in societies, and it is very difficult to study the di- 
achronic developments. But we can try to correlate structural traits of human so- 
cieties with other traits and draw inferences about possible causal factors. There 
is no perceived need in this literature to show that the mechanisms by which 
religion or monogamy arise must be of a particular type. Basically, when the 
result is preferred, any kind of change can give rise to the result, and we do not 
need to understand the nature of the change, let alone show that the change was 
motivated by the result. 

Another striking example from linguistics is the shortness of frequent words, 
which is surely adaptive. But there are quite diverse paths to shortness. Accord- 
ing to Zipf (1935), shorter words are shorter because they underwent clipping 
processes (e.g. laboratory > lab), and according to Bybee (2007: 12), short words
are short because "high-frequency words undergo reductive changes at a faster 
rate than low-frequency words [...] the major mechanism is gradual phonetic 
reduction". But actually in most cases, rarer words are longer because they are 
(originally) complex elements, consisting of multiple morphs, e.g. horse vs. hippopotamus,
car vs. cabriolet, church vs. cathedral. Drastic shortening of longer words seems to occur
primarily in the modern age with its large number of technical and bureaucratic innovations,
but even here, clipping is only one of many possibilities; for example, Ronneberger-Sibold
(2014) discusses a number of fairly diverse “shortening techniques” in German. What unites
all of these processes is only one feature: the outcomes of the changes, which are
functionally adapted.

¹² The same is true for adaptive explanations in evolutionary biology: The fact that wings are
adaptive can be inferred from the way wings are used by animals, and we do not expect that
wings arise in uniform ways (wings of birds, bats and insects have diverse origins and arose
by diverse paths of change, whose nature is not relevant to the adaptive explanation).

When Cristofaro (2014: 297) writes that “any model of the principles that lead to the use of
particular constructions [...] should take into account the diachronic development of these
constructions, rather than just their synchronic distribution”, I certainly agree, because I
think that the diachronic developments can
illuminate the functional adaptation, and a close study of whatever we can learn 
about diachrony can tell us whether any mutational constraints might play a role. 
But when there is strong evidence for a universal tendency and there is a good 
functional-adaptive explanation available, the diachronic evidence is not strictly 
speaking necessary. 


7 A cost scale of constraints 


What are we to do when there are several possible explanations, using different 
kinds of causal factors? For example, what do we do when word-order corre- 
lations can be explained either by functional adaptation (processing efficiency, 
Hawkins 2014) or by mutational constraints? Or when case-marking splits can 
be explained either by Universal Grammar (Kiparsky 2008: §2.3) or by efficiency
of coding? 

The answer is that there is a COST SCALE of constraints: 


(3) less costly <--> more costly

    mutational > functional-adaptive > representational constraints


The "cheapest" type of explanation is the mutational mode, because language 
change can be observed, and if we find that certain changes simply do not occur 
(for whatever reason), we do not need to make more far-reaching claims. Thus, 
Bybee (2010: 111) discusses the Greenbergian word order correlations and notes 
that "grammaticalization gives us the correct orders for free" — a formulation that 
reflects the assessment that mutational constraints do not involve any additional “cost”.¹³

¹³ Cf. also the similar argumentation in Kiparsky (2008: 33), in connection with a different
phenomenon (involving reflexives): “That is really all that needs to be said [...] The
historical explanation covers the data perfectly.” I completely agree with Bybee and Kiparsky
in this respect.
The next type of explanation on the scale appeals to functional-adaptive constraints. These
are more costly because we cannot observe their effects directly and have to rely heavily on
inference. But they are less costly than representational constraints, because they are far
more general, applying also in other domains of cognitive processing and communication, often
also in nonhuman animals. Again, this is not really controversial: In his chapter on
Universal Grammar, Jackendoff (2002: 79) says that “we should be conservative in how much
linguistic structure we ascribe to an innate UG. We should welcome explanations of linguistic
universals on more general cognitive grounds.”

It is only when we observe a cross-linguistic regularity that cannot be ex- 
plained either by mutational constraints or by functional-adaptive constraints 
that we need to appeal to representational constraints. These involve the most 
specific (and thus most costly) mechanism, which should only be invoked as a 
last resort. 


8 Conclusion 


In this paper I have argued that cross-linguistic regularities may be explained either by
mutational constraints or by functional-adaptive constraints (or perhaps by representational
constraints, as in generative grammar) (§2). Both kinds of explanations involve diachrony,
but in different ways: Mutational constraints are constraints on possible sources or pathways
of change, while functional-adaptive constraints influence the results of changes (§3). In
order to explain a universal tendency, we need to appeal to mutational constraints; merely
noting a frequent pathway of change is not enough (§4). We can be sure that a cross-linguistic
regularity is due to a functional-adaptive rather than a mutational constraint if there are
diverse pathways of change which converge on a single result (§5). The functional-adaptive
constraint must influence language change in such a way that change in a particular direction
becomes more likely, but this need not be visible in the change itself (§6). But when we have
good reasons to think that there is a mutational constraint, it takes precedence over
functional-adaptive and representational explanations (§7).

Thus, the answer to the question in the title of this paper (“Can cross-linguistic 
regularities be explained by constraints on change?”) is: Yes, some regularities
can apparently be explained in this way, but clearly not all of them. There re- 
mains an important role for functional-adaptive constraints in explaining lan- 
guage universals. 
Chapter 2


Taking diachronic evidence seriously: 
Result-oriented vs. source-oriented 
explanations of typological universals 


Sonia Cristofaro 


University of Pavia 


Classical explanations of typological universals are result-oriented, in that particu- 
lar grammatical configurations are assumed to arise because of principles of op- 
timization of grammatical structure that favor those configurations as opposed 
to others. These explanations, however, are based on the synchronic properties 
of individual configurations, not the actual diachronic processes that give rise to 
these configurations cross-linguistically. The paper argues that the available evi- 
dence about these processes challenges result-oriented explanations of typological 
universals in two major ways. First, individual grammatical configurations arise 
because of principles pertaining to the properties of particular source construc- 
tions and developmental mechanisms, rather than properties of the configuration 
in itself. Second, individual configurations arise through several distinct diachronic 
processes, which do not obviously reflect some general principle. These facts point 
to a new research agenda for typology, one focusing on what source construc- 
tions and developmental mechanisms play a role in the shaping of individual cross- 
linguistic patterns, rather than the synchronic properties of the pattern in itself. 


1 Introduction 


In the functional-typological approach that originated from the work of Joseph 
Greenberg, language universals (henceforth, typological universals) are skewed 
cross-linguistic distributional patterns whereby languages recurrently display 
certain grammatical configurations as opposed to others. Explanations for these 
patterns are usually result-oriented, in the sense that at least some of the relevant
configurations are assumed to arise because of some postulated principle
of grammatical structure, which favors those particular configurations and dis- 
favors other logically possible ones. 

For example, a number of word order correlations have been explained by as- 
suming that speakers will recurrently select particular word orders as opposed to 
others because these orders lead to syntactic structures that are easier to process 
(Hawkins 2004, among others). Another case in point is provided by explanations 
of the use of explicit marking for different grammatical meanings, for example 
the use of overt marking for different number values, or that of dedicated case 
marking for different NP types occurring in particular argument roles. Cross- 
linguistically, explicit marking may be restricted to less frequent meanings, for 
example plural rather than singular, animate rather than inanimate P arguments, 
or inanimate rather than animate A arguments, but is usually not restricted to 
more frequent meanings.¹ This has been assumed to reflect a principle of economy
whereby speakers will tend to use explicit marking only when they really
need to do so. Explicit marking can be restricted to less frequent meanings because
more frequent ones are easier to identify, and hence less in need of
disambiguation (Greenberg 1966; Corbett 2000; Croft 2003; Haspelmath 2006; 2008).

These explanations are based on the synchronic properties of the relevant dis- 
tributional patterns, not the actual diachronic processes that shape these distri- 
butions from one language to another. For example, assumptions about the role 
of processing ease in determining particular word order correlations are based 
on the synchronic syntactic configurations produced by particular word orders, 
not the actual diachronic origins of these orders from one language to another. 
Similarly, the idea that the use of explicit marking reflects economy is based on 
the synchronic cross-linguistic distribution of particular constructions across dif- 
ferent contexts (e.g. zero vs. overt marking across singular and plural, dedicated 
case marking across animate and inanimate A and P arguments), not the actual 
diachronic processes that give rise to this distribution in individual languages. 

This paper discusses various types of diachronic evidence about the cross- 
linguistic origins of two phenomena that have been described in terms of typolog- 
ical universals, the distribution of accusative vs. ergative case marking alignment 
across different NP types and that of zero vs. overt marking across singular and
plural.


¹ Following a standard practice in typology (see, for example, Comrie 1989 or Dixon 1994), the
labels A, P and S are used throughout the paper to refer to the two arguments of transitive 
verbs and the only argument of intransitive verbs. 
This evidence, it will be argued, challenges classical, result-oriented explanations
of typological universals in two major ways. First, recurrent grammatical
configurations cross-linguistically do not appear to arise because of principles
that favor those particular configurations in themselves. This challenges
the idea that these principles play a role in the emergence of the distributional 
patterns described by the relevant universals. Second, individual configurations 
arise through several distinct diachronic processes, which do not obviously re- 
flect some general principle. This challenges the idea that explanations for par- 
ticular distributional patterns can be read off from the synchronic properties of 
the relevant grammatical configurations, because these properties can originate 
differently in different cases. These facts call for a source-oriented approach to 
typological universals, one in which the patterns described by individual univer- 
sals are accounted for in terms of the actual diachronic processes that give rise 
to the pattern, rather than the synchronic properties of the pattern in itself. 


2 The animacy/referential hierarchy: Some possible 
origins of alignment splits in case marking 


One of the most famous typological universals is the animacy/referential hierarchy
in (1):


(1) 1st person pronouns > 2nd person pronouns > 3rd person pronouns > 
human > animate > inanimate (Croft 2003: 130, among others) 


Among other phenomena, this hierarchy captures some recurrent splits in the 
distribution of accusative and ergative case marking alignment across different 
NP types. Accusative alignment can be restricted to a left end portion of the hier- 
archy (e.g. pronouns, human and animate nouns), but is usually not restricted to 
a right end portion of the hierarchy (e.g. inanimate nouns, nouns as opposed to 
pronouns). Conversely, ergative alignment is sometimes restricted to a right end 
portion of the hierarchy (e.g. inanimate nouns, nouns as opposed to pronouns, 
nouns and 3rd person pronouns), but is usually not restricted to a left end por- 
tion of the hierarchy (1st/2nd person pronouns, pronouns as opposed to nouns, 
pronouns and animate nouns). 

A classical result-oriented explanation for this distribution invokes the econ- 
omy principle mentioned in Section 1. Speakers tend to use dedicated case mark- 
ing only when it is really needed, that is, when some grammatical role is more 
in need of disambiguation. The NPs towards the right end of the hierarchy (inan- 
imates, nouns as opposed to pronouns) are more likely to occur as P arguments, 
hence, when they do, the P role is relatively easy to identify, and hence less in
need of disambiguation. Dedicated case marking for P arguments, leading to ac- 
cusative alignment, may then be limited to the NPs towards the left end of the 
hierarchy (pronouns, animate nouns). By contrast, these NPs are more likely to 
occur as A arguments, hence, when they do, the A role is less in need of disam- 
biguation. Dedicated case marking for A arguments, leading to ergative align- 
ment, may then be limited to the NPs towards the right end of the hierarchy 
(Silverstein 1976; Dixon 1979; 1994; Comrie 1981; DeLancey 1981; Song 2001; Croft 
2003). 

This explanation, however, is not supported by the available diachronic evi- 
dence about the origins of the relevant grammatical configurations across lan- 
guages. In many cases where accusative or ergative alignment is restricted to 
particular NP types, the relevant alignment pattern is a result of the development 
of an accusative or ergative marker through the reinterpretation of a pre-existing 
element with similar distributional restrictions. In some cases, for example, ac- 
cusative markers restricted to pronominal, animate or definite direct objects are 
structurally identical to topic markers. This is illustrated in (2) for Kanuri. 


(2) Kanuri (Nilo-Saharan; Cyffer 1998: 52)

a. Müsa shí-ga cüro.
   Musa 3SG-ACC saw
   ‘Musa saw him.’

b. wü-ga
   1SG-as.for
   ‘as for me’


In such cases, the accusative marker plausibly originates from the topic marker in 
contexts where the latter refers to a P argument and is reinterpreted as a marker
for this argument (‘As for X’ > ‘X ACC’: see, for example, Rohlfs 1984 and Pensado
1995 for Romance languages, and König 2008 for several African languages).
Topics are usually pronominal, animate and definite, so topic markers are mainly 
used in the same contexts as the resulting accusative markers. 

Ergative markers not applying to first and second person pronouns have been 
shown to originate from various types of source elements not applying to these 
pronouns either. Sometimes, for example, the ergative marker is derived from an 
indexical element, such as a demonstrative or a third person pronoun, as illus- 
trated in (3) for Bagandji. McGregor (2006; 2008) argues that in such cases the 
indexical element is originally used to emphasize the referent of the A argument,
as this referent is a new or unexpected agent. This strategy does not apply to
first and second person pronouns because the referents of these pronouns are 
typically expected agents. 


(3) Bagandji (Australian; Hercus 1982: 63)
Yadu-duru gàndi-d-uru-ana. 
wind-DEM/ERG carry-FUT-3SG.SBJ-3SG.OBJ 


‘This wind will carry it along / The wind will carry it along’ 


In other cases, the ergative marker is derived from a marker used to encode 
instruments in transitive sentences with no overt third person arguments. In 
these sentences, the instrument can be reinterpreted as an agent, thus evolving 
into the A argument of the sentence. As a result, the marker originally used for 
the instrument becomes an ergative marker. This process has been reconstructed 
by Mithun (2005) for Hanis Coos, illustrated in (4). Instruments are typically 
inanimate, so the relevant markers do not usually occur with first and second 
person pronouns. 


(4) Hanis Coos (Coosan; Mithun 2005: 84) 
K’win-t x=mil:agots. 
shoot-TR OBL/ERG=arrow 


‘An arrow shot (him).’ (from ‘(He) shot at him with an arrow.’)


Restrictions in the distribution of particular alignment patterns across differ- 
ent NP types can also be a result of phonological processes targeting a subset 
of these NPs. In English, for example, accusative alignment became restricted to 
pronouns as a result of nouns losing the relevant inflectional distinctions due to 
sound change, as illustrated in Table 1. 


Table 1: Pronominal and nominal declension in late Middle English 
(Blake 2001: 177-179) 


       1st person   ‘name’

NOM    ik           name
ACC    me           name (from naman)


In Louisiana Creole, A, S and P arguments were originally undifferentiated 
for both nouns and pronouns. Pronominal A and S forms, however, underwent 
phonological reduction, plausibly due to their high discourse frequency. As a result,
as can be seen from Table 2, pronouns developed distinct forms for A and S
arguments on the one hand and P arguments on the other, while nominal A, S, 
and P arguments remained undifferentiated. This led to an accusative case mark- 
ing alignment pattern restricted to pronouns (Haspelmath & APiCS Consortium 
2013). 


Table 2: Pronominal declension in Louisiana Creole (Haspelmath & 
APiCS Consortium 2013) 


                         Subject   Object

Louisiana Creole   1SG   mo        mwa
                   2SG   to        twa


These various processes do not appear to be triggered by the fact that, in the re- 
sulting grammatical configurations, dedicated case marking is restricted to roles 
more in need of disambiguation. In some cases, a pre-existing element is reinter- 
preted as a marker for a co-occurring argument. Topic markers are reinterpreted 
as markers for a co-occurring P argument, and demonstratives and third person 
pronouns are reinterpreted as markers for a co-occurring A argument. This is 
a metonymization process triggered by the contextual co-occurrence of the rele- 
vant elements. In other cases, a pre-existing element evolves into a case marker as 
a result of the reanalysis of the argument structure of the construction. Such pro- 
cesses are plausibly due to meaning similarities between the source construction 
and the resulting construction; for example, instruments can be reinterpreted
as agents because of their role in the action being described, particularly in the 
absence of an overtly expressed agent. In yet other cases, an existing alignment 
pattern becomes restricted to particular NP types because other NPs, due to their 
phonological properties, lose their inflectional distinctions as a result of regular 
sound change. Finally, particular NPs may develop distinct forms for some argu- 
ment roles as a result of the original forms undergoing phonological reduction 
due to their discourse frequency. 

Restrictions in the distribution of accusative and ergative alignment, as determined
by individual processes, directly follow from restrictions in the distribution
of various source constructions, or in the domain of application of particular
developmental mechanisms (such as particular phonological processes). These 
restrictions too, then, cannot actually be taken as evidence for principles that fa- 
vor the resulting grammatical configurations independently of particular source 
constructions and developmental mechanisms. This is further supported by the
fact that, when the source constructions or the developmental mechanisms in- 
volved are not subject to particular distributional restrictions, the distribution of 
accusative or ergative alignment does not display those restrictions either. 

For example, accusative markers sometimes originate from ‘take’ verbs in constructions
of the type ‘Take X and Verb (X)’, where the ‘take’ verb is reanalyzed
as a marker for its former P argument (Lord 1993; Chappell 2013, among several 
others). The P arguments of ‘take’ verbs can be pronominal, nominal, animate or 
inanimate (e.g. ‘take him, it, the child, the sword’), and the resulting accusative 
markers apply to all of these argument types. This is illustrated for Twi in (5), 
where the accusative marker de, derived from a ‘take’ verb, applies to both animate
and inanimate P arguments.


(5) Twi (Niger-Congo; Lord 1993: 66, 79)

a. Wo-de no yee osafohene.
   they-ACC him make captain
   ‘They made him captain’

b. O-de afoa ce boha-m.
   he-ACC sword put scabbard-inside
   ‘He put the sword into the scabbard’


Accusative and ergative markers can also develop from the reanalysis of possessor
or oblique markers used on the notional A or P arguments of various types
of source constructions, for example, ‘X is occupied with the Verbing of Y’ > ‘X is
Verbing Y ACC’, ‘To X it will be the Verbing of Y’ > ‘X ERG will Verb Y’, ‘Y is X's
Verbed thing’, ‘Y is Verbed by X’ > ‘X ERG Verbed Y’. These processes have been
described for a wide variety of languages (see, for example, Harris & Campbell
1995; Bubenik 1998; Gildea 1998; Creissels 2008). In such cases too, the relevant
A and P arguments can be nominal, pronominal, animate or inanimate NPs (e.g.
‘The Verbing of you, of it, of the house’; ‘You are Verbed, the house is Verbed’).
The markers used for these arguments, then, will be used with all of these NPs,
and the resulting accusative or ergative markers are used with all of these NPs
too. This is illustrated in (6) and (7), where accusative and ergative markers derived
in this way are used, respectively, with nominal inanimate and pronominal
animate arguments.




(6) Wayana (Carib; Gildea 1998: 201)
i-pakoro-n iri pək wai.
1-house-ACC make occupied.with 1.be
‘I'm (occupied with) making my house’ (originally ‘I am occupied with
my house's making.’)


(7) Cariña (Carib; Gildea 1998: 169)
A-eena-ri i-'wa-ma.
2-have-NMLZ 1-ERG-3.be
‘I will have you.’ (from a nominalized construction ‘To me it will be your
having.’)


On a similar note, loss of inflectional distinctions through sound change, lead- 
ing to the loss of particular alignment patterns, targets specific forms because of 
their phonological properties. This process, then, can affect different NP types 
cross-linguistically, provided that the relevant forms have specific phonological 
properties. This leads to different distributional restrictions for particular align- 
ment patterns. In English, as detailed earlier, the process affected nouns as op- 
posed to pronouns, leading to accusative alignment becoming restricted to pro- 
nouns. In Nganasan, however, a combination of sound change and analogical 
levelling led to a loss of inflectional distinctions for pronouns, but not for nouns 
(Filimonova 2005: 94-98). As a result, as can be seen from (8), accusative align- 
ment became restricted to nouns, even though this configuration should be dis- 
favored in terms of economy, because nominal P arguments are less in need of 
disambiguation than pronominal ones. 


(8) Nganasan (Uralic; Filimonova 2005: 94)
a. Mənə nanunto mintol'i-ʔo-ŋ.
   1SG 2SG.LOC-INSTR take-INDEF-2SG
   ‘You have taken me with you.’ (pronominals originally had dedicated
   accusative forms, e.g. mana-m ‘1SG-ACC’)

b. gülcezo tundi-m tandarku-éü.
   wolf fox-ACC chase-3SG.A
   ‘The wolf is chasing the fox.’


If there were principles that favor or disfavor particular distributional restrictions
for accusative and ergative alignment because of properties of the resulting
grammatical configurations, we would not expect the development of these restrictions
to be tied to specific source constructions and developmental mechanisms.

Finally, individual distributional restrictions develop through several distinct 
processes, which are rather different in nature and provide independent motiva- 
tions for the restriction. In some cases, particular restrictions arise as accusative 
and ergative case markers develop through processes of context-driven reinter- 
pretation of various types of source elements, which, for different reasons, are 
restricted in the same way. In other cases, the restrictions reflect the domain 
of application of different phonological processes. To the extent that different 
diachronic processes provide different motivations for particular distributional 
restrictions, explanations for these restrictions cannot be read off from the restric- 
tions in themselves, because these can originate differently in different cases. 


3 Some possible origins of zero vs. overt marking for 
singular and plural 


Another well-known typological universal pertains to the use of zero vs. overt 
marking for singular and plural. Languages can use overt marking for plural 
and zero marking for singular, but usually not the other way round. A classi- 
cal, result-oriented explanation for this pattern, as mentioned in Section 1, is in 
terms of economy. Speakers tend to use overt marking only for meanings that are 
more in need of disambiguation, and plural is more in need of disambiguation 
than singular due to its lower discourse frequency. As a result, overt marking 
can be limited to plural, whereas it will not be limited to singular (Greenberg 
1966; Croft 2003; Haspelmath 2008). This explanation, however, is not supported 
by a number of diachronic processes that lead languages to have zero marked 
singulars and overtly marked plurals. 

Often, in languages which make no distinction between singular and plural, an 
overt plural marker evolves through the reinterpretation of pre-existing expres- 
sions, whereas singulars retain zero marking. Sometimes, some expression takes 
on a plural meaning originally associated with a co-occurring expression. For 
example, in partitive constructions involving plural quantifiers (‘many of them’
> ‘they PL’), the partitive marker can take on the meaning of plurality associated
with the quantifier as the latter is lost. This process took place in Bengali, as 
illustrated in (9). 
(9) Bengali (Indo-Aryan; Chatterji 1926: 735-736)
a. ämhä-rä saba
we-GEN all
‘all of us’ (14th century)


b. chele-ra 
child-GEN 


‘children’ (15th century) 


In other cases, plurality becomes the central meaning of expressions inher- 
ently or contextually associated with this notion but originally used to encode 
other meanings, for example, distributive expressions (‘house here and there’) or
expressions of multitude (‘all’, ‘people’). This is illustrated in (10) and (11).


(10) Southern Paiute (Uto-Aztecan; Sapir 1930-1931: 258) 
qanı / qagqa'ni 
house / house.DISTR 


‘house, houses’ 


(11) Maithili (Indo-Aryan; Yadav 1997: 69)
jon sob
laborer all
‘laborers’


Another process that leads languages to have zero marking for singular and 
overt marking for plural is the elimination of an overt singular marker through 
regular sound change in a situation where both singular and plural are originally 
overtly marked. This was, for example, the case in English, where singular and 
plural were both originally overtly marked in most cases. The current configu- 
ration with zero marked singulars and -s marked plurals resulted from a series 
of sound changes that led to the elimination of all inflectional endings except 
genitive singular -s and plural -es (Mossé 1949). 

These various processes do not appear to be triggered by the higher need to 
disambiguate plural as opposed to singular. In some cases, an overt plural marker 
arises as a result of a metonymization process whereby plural meaning is transferred
from a quantifier to some other component of a complex expression. This
is plausibly due to the co-occurrence of the two. In other cases, some expression
evolves into a plural marker because it is contextually or inherently associated
with the notion of plurality, and this notion becomes the central meaning of the
expression as some other meaning component is bleached. In yet other cases, a
pre-existing overt singular marker is eliminated due to regular sound changes 
driven by the phonological properties of the marker. 

The end result of the various processes, the use of overt marking for plural 
rather than singular, is directly motivated in terms of the properties of particular 
source constructions and developmental mechanisms. In many cases, an overt 
marker is used for plural because the source construction is one associated with 
the notion of plurality. Alternatively, sound changes leading to the elimination of 
an overt marker target singular rather than plural markers due to the phonolog- 
ical properties of the former. The fact that overt marking is restricted to plural, 
then, cannot be taken as evidence for principles that favor this particular con- 
figuration independently of particular source constructions and developmental 
mechanisms. As in the case of accusative and ergative case marking alignment, 
this point is further supported by the fact that other source constructions and 
developmental mechanisms give rise to different configurations, that is, overt 
marking for both singular and plural, or just for singular. 

A case in point is provided by Kxoe, illustrated in Table 3 below. This language 
has gender markers derived from third person pronouns (Heine 1982). As the 
pronouns have overt singular and plural forms, the resulting gender markers 
also encode singular and plural, so that the language has overt marking not only 
for plural, but also for singular. 


Table 3: Gender/number markers and third person pronouns in Kxoe 
(Khoisan; Heine 1982: 211) 


                 SG                   PL
Nouns     M      |öa-mä               6a-||u‘a                  ‘boy’
          F      |oa-hé               |oa-dji                   ‘girl’
          C      J0a-("à), |6a-dji    0a-nà                     ‘child’
Pronouns  M      xd-d, d-mä, i-ma     xà-||uá, á-||uá, i-||ua   ‘he’
          F      xa-he, a-hé, i-hé    xà-djí, á-djí, i-dji      ‘she’
          C      (xa-'d)              xà-nà, á-nà, í-nà         ‘it’


Also, as described above, partitive case markers can evolve into plural markers 
by taking on the plural meaning associated with a co-occurring plural quantifier. 
In expressions where the quantifier is singular (‘one of them’), however, this same 
process can lead to the development of singular markers, sometimes leading to a 
configuration where only singular is overtly marked. This was the case in Imonda, 
which has zero marked plurals, but developed an overt non-plural (singular and
dual) marker from a source case marker used in partitive constructions (Seiler 
1985: 38-39). 


(12) Imonda (Border; Seiler 1985: 194, 219) 


a. Agö-ianei-m ainam fa-i-köhö.
   women-NONPL-GL quickly CLF-LNK-go
   ‘He grabbed the woman.’

b. mag-m ad-ianéi-m
   one-GL boy-SRC-GL
   ‘to one of the boys’


Similar observations apply to loss of number markers through sound change. 
This process can affect either singular or plural markers depending on the phono- 
logical properties of the marker. From one language to another, then, the process 
may lead either to zero marked singulars and overtly marked plurals, as detailed 
above for English, or to the opposite configuration. In the Indo-Aryan language 
Sinhala, for example, some inanimate nouns have overtly marked singulars and 
zero marked plurals (e.g. pot-a/pot ‘book-SG/book.PL’). This was a result of sound
changes leading to the loss of the plural ending of a specific inflectional class in 
the ancestor language (Nitz & Nordhoff 2010: 250-256). Similarly, in Nchanti, a 
Beboid language, nouns in classes 3/4 have overt marking in the singular and zero 
marking in the plural, e.g. kʷaŋ/kaŋ ‘firewood.SG/firewood.PL’, kʷee/kee
‘moon.SG/moon.PL’. Originally, both singular and plural were marked overtly through the
two prefixes *u- and *i- respectively. As these were eliminated, the singular prefix
led to the labialization of the initial consonant of the stem, while the plural
prefix left no trace (Hombert 1980). 

Finally, just like distributional restrictions for accusative and ergative case 
marking alignment, the fact that a language uses zero marking for singular and 
overt marking for plural can be a result of a variety of diachronic processes, 
which lead to this particular configuration for different reasons. In many cases, 
both singular and plural are originally zero marked (i.e., the language makes no 
distinction between the two), but zero marking becomes restricted to singular 
because different expressions, for different reasons, evolve into plural markers. 
In other cases, both singular and plural are originally overtly marked, and sound 
change leads to the loss of singular markers due to their phonological proper- 
ties. Explanations for why overt marking is restricted to plural, then, cannot be 
read off from this configuration in itself, because it can originate differently in
different cases. 


4 Rare grammatical configurations and result-oriented 
explanations 


Result-oriented explanations of typological universals are crucially based on the 
fact that certain logically possible grammatical configurations are significantly 
rarer than others in the world's languages. This is usually accounted for by pos- 
tulating principles that both disfavor those configurations and favor some of the 
other configurations. For example, the rarity of configurations where singular is 
overtly marked and plural is zero marked is assumed to be due to the fact that 
these configurations are disfavored by economy, and hence will usually not oc- 
cur in the world's languages. This same principle is assumed to also favor the 
opposite configuration, zero marking for singular and overt marking for plural, 
thus providing a motivation for the occurrence of this configuration. 

Haspelmath (2019 [this volume]) uses this line of reasoning to claim that result-oriented
explanations should be invoked even in cases where the development
of some grammatical configuration is accounted for by the properties of particu- 
lar source constructions or developmental mechanisms, rather than synchronic 
properties of the configuration in itself. Haspelmath concedes that, in such cases, 
there is no direct evidence that the occurrence of the configuration is motivated 
by principles pertaining to its synchronic properties (functional-adaptive princi- 
ples, in his terminology). He argues, however, that this hypothesis is supported 
by two types of indirect evidence: the fact that other logically possible configu- 
rations are significantly rarer, and what he calls multi-convergence, the fact that 
different diachronic processes all lead to that particular configuration. Accord- 
ing to Haspelmath, these facts can only be accounted for by assuming that the 
occurrence of the configuration is ultimately motivated by principles that favor 
that configuration independently of the diachronic processes that give rise to it. 
Haspelmath draws a parallel with the notion of adaptiveness in evolutionary biol- 
ogy (and other domains): The development of particular traits is independent of 
the fact that those traits are adaptive to the environment, in the sense of confer- 
ring an evolutionary advantage to the organisms carrying them, but adaptiveness 
provides the ultimate explanation for their spread and survival in a population. 

There is, however, a logical fallacy in the idea that, if some principle motivates
the non-occurrence of some configuration (and hence its rarity), then the occur- 
rence of some other configuration is motivated by the same principle. The fact 
that some principle A provides the motivation for some phenomenon X can be
framed as a logical implication, X → A (because X will always involve A, unless
other motivations for X are also postulated). This implication means, however,
that the absence of A will lead to phenomena different than X, that is, ¬A →
¬X, not that phenomena different from X are also motivated by A. This would
be a distinct logical implication, ¬X → A, with a different truth table. For example,
if the non-occurrence of configurations where singular is overtly marked
and plural is zero marked (X) is assumed to be due to economy (A), this means
that principles other than economy (¬A) will lead to the occurrence of other
configurations (¬X), not that the latter phenomenon is also due to economy. This
undermines the general logic of result-oriented explanations, including Haspel- 
math's argument: From the fact that some principle provides a motivation for the 
non-occurrence of some configuration, we cannot conclude that it also provides 
a motivation for the occurrence of other configurations. 
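
The difference between the two implications can be checked mechanically. The following is a minimal sketch in Python, not part of the original argument: it enumerates all truth assignments for X and A and compares the truth tables of the three implications discussed above. The function names are illustrative assumptions.

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q

def truth_table(formula):
    """Evaluate a two-place formula over all (X, A) assignments in a fixed order."""
    return [formula(x, a) for x, a in product([True, False], repeat=2)]

# X -> A: the principle A motivates the phenomenon X
x_implies_a = truth_table(lambda x, a: implies(x, a))
# not-A -> not-X: the contrapositive, which follows from X -> A
contrapositive = truth_table(lambda x, a: implies(not a, not x))
# not-X -> A: the claim that phenomena other than X are also motivated by A
other_claim = truth_table(lambda x, a: implies(not x, a))

print(x_implies_a == contrapositive)  # True: same truth table
print(x_implies_a == other_claim)     # False: a distinct implication
```

The first comparison holds because an implication and its contrapositive are equivalent. The second fails because the two formulas differ, for instance, when X holds and A does not.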

As for the multi-convergence argument, this ignores the fact that different 
diachronic processes can all lead to the same synchronic output for different rea- 
sons, as detailed in Sections 2 and 3. If the same synchronic output is motivated 
differently in different cases, multi-convergence cannot be taken as evidence for 
principles that favor that output independently of the individual processes that 
give rise to it. Instead, to the extent that the various processes recurrently take 
place in different languages, the cross-linguistic distribution of the output will 
be a combined result of the effects of each process. 

From a logical point of view, source-oriented explanations do not rule out that 
the cross-linguistic distribution of particular grammatical configurations may ul- 
timately also be determined by properties pertaining to the synchronic proper- 
ties of the configuration, as assumed by Haspelmath. For example, these factors 
could play a role in the transmission of the configuration from one speaker to an- 
other, or its retention across different generations of speakers. This would be the 
equivalent of the notion of adaptive evolution through natural selection in evolu- 
tionary biology: particular genetic traits do not develop because they confer 
an evolutionary advantage to the organisms carrying them, but this provides the 
ultimate explanation for their distribution in a population.” 


? A referee suggests that this is similar to Lass's (1990) use of the notion of exaptation: particu- 
lar grammatical traits may lose their original function, but they are retained in the language 
because they are deployed for novel functions. This, however, is meant to account for why 
particular traits survive in a language despite losing their original function, not why they are 
selected over others, as is the case with result-oriented explanations of typological universals 
and explanations of biological evolution in terms of adaptiveness through natural selection. 

In evolutionary biology, however, there is direct evidence for adaptiveness, in 
that particular genetic traits make it demonstrably more likely for the organisms 
carrying them to survive and pass them on to their descendants. For languages, 
on the other hand, there is generally no evidence that the fact that some gram- 
matical configuration conforms to the principles postulated in result-oriented 
explanations, for example economy, makes it more likely for that configuration 
to spread and survive in a speech community. In fact, there is a long tradition of 
linguistic thought in which the propagation of individual constructions within a 
speech community is entirely determined by social factors independent of par- 
ticular inherent properties of the construction (see, for example, McMahon 1994 
and Croft 2000 for reviews of the relevant issues and literature). 

In principle, there is another sense in which particular grammatical configura- 
tions could be adaptive. While individual configurations directly reflect the prop- 
erties of particular source constructions or developmental mechanisms, it could 
be the case that the specific diachronic processes that give rise to the configura- 
tion are ultimately favored by principles pertaining to its synchronic properties. 
For example, different processes of context-driven reinterpretation leading to 
overt marking for less frequent types of argument roles could be favored by the 
need to give overt expression to these roles. Similarly, different processes lead- 
ing to zero marking for singulars (zero marking becoming restricted to singular 
due to the development of an overt plural marker, phonological erosion of an 
existing overt singular marker) could be favored by the lower need to give overt 
expression to singular as opposed to plural. 

These assumptions, however, are not part of any standard account of the rele- 
vant processes in studies of language change (see Bybee et al. 1994: 298-300 and 
Slobin 2002: 381 for an explicit rejection of this view in regard to grammaticaliza- 
tion, as well as Cristofaro 2017 for more discussion). In fact, diachrony provides 
specific evidence against the idea that particular grammatical configurations de- 
velop both because of properties of particular source constructions or develop- 
mental mechanisms and because of principles that favor the configuration in 
itself. As detailed in Sections 2 and 3, different source constructions and devel- 
opmental mechanisms give rise to different grammatical configurations, even 
when this goes against some postulated principle that favors some of these con- 
figurations as opposed to the others. This is not what one would expect if there 
were principles favoring particular grammatical configurations independently of 
the specific source constructions or developmental mechanisms that give rise to 
them. 

All this means that, to the extent that a principled source-oriented explanation 
is available for the occurrence of particular grammatical configurations, explana- 
tions in terms of the synchronic properties of the configuration are redundant, 
because we do not have either direct or indirect evidence for these explanations 
(see Blevins 2004 for similar arguments in phonology, and Newmeyer 2002; 2004 
for an application of this line of reasoning to optimality-theoretic models of typo- 
logical universals). Of course, one still needs to account for the fact that certain 
logically possible grammatical configurations are significantly rarer than others. 
This phenomenon, however, is logically independent of the possible motivations 
for the occurrence of the more frequent configurations, as detailed above. To 
the extent that individual grammatical configurations arise due to properties of 
particular source constructions or developmental mechanisms, any differences 
in the frequency of particular configurations will reflect differences in the fre- 
quency of the source constructions or developmental mechanisms that give rise 
to those configurations. The higher frequency of particular configurations will 
then be a result of the higher frequency of the source constructions and devel- 
opmental mechanisms that give rise to them, while the rarity of other configu- 
rations will be due to the rarity of possible source constructions or developmen- 
tal mechanisms for those configurations (see Harris 2008 for an earlier formu- 
lation of this point in regard to tripartite case marking alignment). Frequency 
differences in the occurrence of particular source constructions or developmen- 
tal mechanisms need to be accounted for, but they need not be related to any 
properties of the resulting configurations, so they should be investigated inde- 
pendently. 


5 Concluding remarks 


Source-oriented explanations of typological universals are in line with classical 
views of language change held within grammaticalization studies and historical 
linguistics in general. These views are manifested, for example, in accounts of 
the development of tense, aspect and mood systems, or alignment patterns (By- 
bee et al. 1994; Harris & Campbell 1995; Gildea 1998; Traugott & Dasher 2002, 
among others). In these accounts, grammatical change is usually not related to 
synchronic properties of the resulting constructions, for example the fact that the 
use of these constructions complies with some postulated principle of optimiza- 
tion of grammatical structure. Rather, grammatical change is usually a result 
of the properties of particular source constructions and the contexts in which 
they are used. In particular, new grammatical constructions recurrently emerge 
through processes of context-induced reinterpretation of pre-existing ones, and 
their distribution originally reflects the distribution of the source constructions. 

In source-oriented explanations, the patterns captured by typological univer- 
sals originate from several distinct diachronic processes, which involve differ- 
ent source constructions and developmental mechanisms. These processes recur- 
rently take place in different languages, and are plausibly motivated by the same 
factors from one language to another. Individual patterns, however, are a com- 
bined result of the cross-linguistic frequencies of the various processes, rather 
than a result of some overarching principle independent of these processes. 

While this scenario is more complex and less homogeneous than those as- 
sumed in result-oriented explanations, it is consistent with what is known about 
the actual origins of the relevant grammatical configurations in individual lan- 
guages, and it makes it possible to address several facts not accounted for in these 
explanations. 

For example, the patterns captured by typological universals usually have ex- 
ceptions. This is in contrast with the assumption that these patterns reflect prin- 
ciples of optimization of grammatical structure that are valid for all languages, 
because in this case one has to account for why these principles are violated in 
some languages. Also, individual principles invoked in result-oriented explana- 
tions are often in contrast with some of the grammatical configurations captured 
by individual universals. For example, the idea that zero marking of more fre- 
quent meanings is motivated by economy is in contrast with the fact that these 
meanings are overtly marked in many languages. 

These facts have sometimes been dealt with in terms of competing motiva- 
tions models, but a general problem with this approach is that it may lead to 
a proliferation of explanatory principles for which no independent evidence is 
available (Newmeyer 1998: 145-153, Cristofaro 2014, among others). If the pat- 
terns captured by typological universals reflect the properties of different source 
constructions and developmental mechanisms, however, then it is natural that 
they should have exceptions, because not all languages will have the same source 
constructions, nor will particular developmental mechanisms be activated in all 
languages. Principles pertaining to the synchronic properties of the pattern will 
fail to account for all of the relevant grammatical configurations because the 
pattern is not actually motivated by those principles. 

Over the past decades, several linguists have emphasized the need for source- 
oriented explanations of typological universals (Bybee 1988; 2006; 2008; Aristar 
1991; Gildea 1998; Cristofaro 2013; 2014; 2017; Anderson 2016). This view, however, 
has not really made its way into the actual typological practice, despite the close 
integration between typology and studies of language change (a fully fledged re- 
search approach along these lines is, on the other hand, the Evolutionary Phonol- 
ogy framework developed in Blevins 2004). While diachronic evidence about the 
origins of the patterns captured by individual universals is much scantier and less 
systematic than the synchronic evidence about these patterns, it poses specific 
foundational problems for existing result-oriented explanations of these univer- 
sals. These problems point to a new research agenda for typology, one focusing 
on what source constructions and developmental mechanisms play a role in the 
shaping of individual cross-linguistic patterns, as well as why certain source con- 
structions or developmental mechanisms are rarer than others. 


Abbreviations 


The paper conforms to the Leipzig Glossing Rules. Additional abbreviations in- 
clude: 


C      common      NONPL  non-plural 
GL     goal        SRC    source 
LNK    linker 


Chapter 3 


Some language universals are historical 
accidents 


Jeremy Collins 
Radboud University Nijmegen 


In this short paper, I elaborate on previous work by Givón (1971) and Aristar (1991) 
to argue that a substantial part of the well-known word-order correlations is best 
explained by grammaticalisation processes. Functional-adaptive accounts in terms 
of processing or learning constraints are currently weakly substantiated, and they 
suffer from the fact that they do not adequately control for language-internal inher- 
itance patterns. More generally, historical relatedness between different types of 
phrases constitutes an important confound in typological research, one that needs 
to be taken seriously before word-order correlations are motivated by anything 
other than the diachronic patterns that link the word order pairs in question. 


1 Introduction 


There are surprisingly few properties that all languages share. Almost every at- 
tempt at articulating a genuine language universal tends to have at least one 
exception, as documented in Evans & Levinson (2009). However, there are non- 
trivial properties that are found, if not in literally all languages, then in enough of 
them, across multiple language families and independent areas of the world, that 
they demand an explanation. 

An example is the fact that languages have predictable word orders. If a lan- 
guage has the verb before the object, it tends to have prepositions rather than 
postpositions, as in English; if the verb is after the object, it is a good bet that the 
language will have postpositions rather than prepositions (Greenberg 1963). The 
ordering of different elements such as a possessed noun and its possessor, or a 
noun and elaborate modifiers (complex adjective phrases, relative clauses), is to 
some extent free to vary among languages, but again tends to fall into correlating 
types (Dryer 1992; 2011). Why should knowing the word order of one category in 
a language help predict the orderings of other categories? One prominent view 
holds that these patterns reflect an innate harmonic ordering principle of Uni- 
versal Grammar, which is ultimately argued to solve the logical problem of lan- 
guage acquisition (Pinker 1994; Baker 2001; Roberts 2007). This would amount 
to what Haspelmath (2019 [this volume]) calls a “representational constraint” on 
the shape of grammars. Another possible explanation is that word-order correla- 
tions have evolved in the service of efficient language processing (e.g. Hawkins 
1994; Kirby & Hurford 1997), i.e. for functional-adaptive reasons. We find this 
view in the functional-typological literature (e.g. Dryer 1992; Evans & Levinson 
2009) as well as in computer simulations in the literature on language evolution 
(Van Everbroeck 1999). 

However, I would argue that many of these patterns are not evidence of our 
psychological preferences, but are accidental consequences of language history. 
More specifically, they are accidental in the sense that they arise as a by-product 
of grammaticalisation processes. These processes do not seem to have word- 
order correlations as a goal, nor is there good evidence for a "pull force" in that 
direction. Accordingly, grammaticalisation is an alternative to functional moti- 
vations here, and an understanding of this historical dimension is thus crucial 
to explaining word-order correlations. In this short paper, I first elaborate this 
claim (§2) based on an earlier publication (Collins 2012), before I outline its con- 
sequences for typological theory and practice (§3). In doing so, I am extending 
a line of argumentation by Givón (1971) and Aristar (1991), but I relate the dis- 
cussion specifically to the concerns of the present volume, and to Haspelmath's 
position paper in particular. 


2 Word-order correlations as a result of 
grammaticalisation 


Grammaticalisation is the process by which new grammatical categories can be 
formed from other (often lexical) categories. For example, Mandarin Chinese has 
a class of words which might be called prepositions from a cross-linguistic point 
of view but which clearly have their historical roots in verbs. An example is 从 
cóng, which in modern Mandarin is a preposition meaning ‘from’ but which in 
classical Chinese was a verb meaning ‘to follow’. It has lost its ability to be used as 
a full verb, requiring another verb such as ‘come’ in the sentence, just as English 
requires a verb in the sentence I come from London. Other Chinese prepositions 
such as 跟 gēn ‘with’ also have a verbal origin, and many preposition-like words 
such as 给 gěi ‘for’ and 在 zài ‘in/at’ even retain verbal meanings (‘give’ and ‘to 
be present’) and verbal syntax (such as being able to be used as the sole verb in 
the sentence and to take aspect marking). These patterns of inheritance directly 
explain why the two types of constituents (i.e. PP and VP) have the same word 
order: Prepositions and verbs were once the same category, and they simply have 
not changed their word orders since then. Since the verb precedes its NP object in 
classical and modern Chinese, its prepositional offspring in modern Chinese also 
precedes its NP complement. Interestingly, Chinese also has postpositions, such 
as li ‘in’, and these, too, are simply continuations of their lexical sources (cf. also 
Dryer 2019 [this volume]). Thus li is etymologically ‘interior’ or ‘village’, hence 
fangzi li ‘in the house’ might be glossed more literally as ‘the house's inside’. 
Again, the ordering of the younger construction as noun (fangzi)-postposition 
(li) reflects the order of the older construction with genitive (fangzi)-noun (li). 
Very similar remarks apply to Niger-Congo languages like Dagaare in Ghana, 
which also shows typologically mixed adpositional phrases (Bodomo 1997). 

More generally, the pattern of adpositions inheriting the ordering of the noun 
or verb they derive from is replicated in different language families: We find it in 
many Oceanic languages (Lynch et al. 2002: 51), where adpositions are transpar- 
ently nouns and reflect whatever ordering of genitive-noun the language has 
(hence it can be either prepositional, as in Hawaiian, or postpositional, as in 
Motu); we also see it in Indo-European languages (e.g. English across < 13th-century 
Anglo-French an cros ‘on cross’ (Bordet & Jamet 2010: 16)), in Japanese (e.g. kara 
‘from’ < ‘way’, si restrictive particle < ‘do’ (Frellesvig 2010: 132-135)), in Aus- 
tralian languages in which adpositions are morphologically still nouns (Dixon 
2002), in Tibetan and Burmese (DeLancey 1997), and so on. Heine & Kuteva (2007: 
62) even remark that “we are not aware of any language that has not undergone 
such a process”. 

Grammaticalisation can also often explain the ordering of verb and object 
correlating with genitive and noun ordering (Dryer 2011). Certain types of verb 
phrase derive historically from noun phrases made up of a nominalised verb and 
its patient argument in a possessive construction. An example is Ewe: 


(1) Ewe (Atlantic-Congo, Gbe; Claudi 1994: 220) 
    Me-le      é-kpo             dzi. 
    1SG-be.at  3SG.POSS/OBJ-see  surface/on 
    ‘I am seeing him.’ (lit. ‘I am on his seeing’) 


Ewe is normally SVO but employs the genitive-noun ordering here (‘his see- 
ing’), creating a construction which is SOV. Nominalisations of this kind are 
used cross-linguistically for expressing aspect (such as the continuous aspect 
in Ewe), for subordinate clauses (expressing ‘I was surprised that he saw me’ as 
‘I was surprised at his seeing of me’ in Javanese, cf. Ogloblin 2005: 618) and for 
voice marking (in Austronesian languages, cf. Himmelmann 2005: 174). These 
verb phrases can become the most frequently used and unmarked verb phrases 
in the languages, thus the basic verb-object order of a language can evolve from 
a genitive-noun construction, even if the nominal origins of the verb form are 
no longer transparent. 

This development of (main-clause) verb phrases from nominalised verbs with 
a possessor object is again attested in very different language families, although 
it is more complicated to reconstruct. A typical example is the evolution of VOS 
ordering in Proto-Austronesian, which has been inherited by over a thousand 
Austronesian languages or evolved further into SVO or VSO (Adelaar 2005: 7). 
It is now generally accepted that verb phrases in Austronesian languages evol- 
ved from nominalised verbs, with a sentence such as ‘The children are looking 
for the house’ deriving from a Proto-Austronesian construction of the type ‘The 
children are the searchers of the house’. Starosta et al. (1982) as well as Kaufman 
(2009) present several pieces of evidence in favour of this diachronic hypothesis: 
For example, the voice markers on verbs derive from nominalising morphemes, 
cognates of which still exist in Tagalog and other languages, such as the locative 
voice marker an which is also used for deriving place names (aklat-an ‘library’ < 
aklat ‘book’). Moreover, the direct object of the verb is marked with the genitive 
marker ng or put into the genitive case if a pronoun. Both nominalisation and the 
use of equational sentences of the form AB 'A is B' are extremely common in con- 
servative Austronesian languages and presumably were in Proto-Austronesian, 
allowing this frequently used construction to become a standard form of predi- 
cation. Thus the verb-object ordering in Austronesian languages derives simply 
from the noun-genitive ordering of Proto-Austronesian, which is still retained 
in these languages. At a stroke this word-order correlation is accounted for in 
roughly a sixth of the world's languages. 

As Sasse (2009: 167) notes in a comment on Kaufman (2009), the situation in 
Austronesian is "not as 'exotic' as it seemed to be at first sight, especially not 
for a Semiticist or an Afroasiaticist". He notes that the Cushitic languages also 
replaced their finite verb forms with participles, which are used with dative marking 
on the agent, in effect saying ‘I have heard it’ as ‘To me was hearing’ (Sasse 2009: 
174); and that the dative pronouns eventually grammaticalised further to finite 
verbal morphology. This change also took place in the Iranian and Indo-Aryan 
languages, stretching over a large linguistic area. 
Sasse also notes independent developments of agents marked with genitive 
case in Mayan and Inuit languages, and Gildea (1997) made a similar reconstruc- 
tion for the Cariban language family, of which the famous OVS language Hixkar- 
yana is an example: It has genitive marking on the object, effectively expressing 
'the enemy will destroy the city' as 'it will be the city's destruction by the en- 
emy' (Gildea 1997: 153), explaining among other things why the subject is placed 
last, and why it has ergative marking. One can add to this list many languages 
in Asia, as described in Yap et al. (2011), such as Tibeto-Burman languages that 
often use nominalised forms in main clauses (e.g. ‘goat-killing exists’ for ‘he is 
killing a goat’, cf. DeLancey 2011: 349), and even Japanese, in which argument 
markers such as ga were originally genitive markers (Shinzato 2011: 461). Exam- 
ples of Niger-Congo languages such as Ewe were given earlier and are discussed 
by Claudi (1994), while Heine describes how many Nilo-Saharan and Chadic lan- 
guages render desiderative sentences in the following way: 


(2) Angas (Afro-Asiatic, Chadic; Heine 2009: 31) 
    Musa  rot   dyip     ko-shwe. 
    Musa  want  harvest  POSS-corn 
    ‘Musa wants to harvest corn.’ (lit. ‘Musa wants the harvesting of the 
    corn.’) 


The historical data thus show that these processes of grammatical change are 
not limited to individual languages or families but can instead be found much 
more widely, and independently of one another. They lead us to predict, then, 
that ultimately all correlations between the ordering of elements in verb phrases 
(V-NP), adpositional phrases (P-NP) and possessive noun phrases (GEN-NP) 
are due to direct historical connections between pairs of phrases (cf. also Croft 
2003: 77-78 for more discussion of such pairs). In the next section, I consider 
the implications of this assumption for both explanation and methodology in 
linguistic typology. 


3 Consequences for typology 


As historical evidence for the grammaticalisation account is accumulating, one 
may ask whether this makes alternative, functional-adaptive explanations in- 
valid. Recall from above that on non-nativist approaches, word-order correla- 
tions are often argued to make sentences easier or more efficient to parse in 
real time, as compared to sentences with mixed head-dependent ordering pat- 
terns (e.g. Hawkins 2004). Is it possible that these factors play a role alongside 
grammaticalisation, such that, for example, processing demands filter out cer- 
tain difficult-to-process constructions, as Kirby & Hurford (1997) suggest (cf. also 
Christiansen 2000)? Put somewhat differently, could it not be the case that gram- 
maticalisation happens to produce orderings that are easy (or easier) to parse? 

There is currently not much evidence to substantiate this view. From a the- 
oretical perspective, there is no indication that the processes involved in gram- 
maticalisation are instigated by considerations of efficient parsing or learning. 
They happen through pragmatic inference in specific communicative contexts 
(Hopper & Traugott 2003: Ch. 4), through widespread metaphorical mappings 
(cf. Deutscher 2005: Ch. 4) and by means of chunking of repeated sequences (By- 
bee 2002). Through these mechanisms, a new construction begins to emerge that 
gradually emancipates from its original lexical source. Since it is gradual, this 
process often creates a chain of intermediate cases, such as denominal adposi- 
tions in Tibetan, some of which still require genitive marking (e.g. mdun ‘front’) 
while others have shed this marking (e.g. nang ‘inside’; cf. DeLancey 1997: 58-59). 
In other words, grammaticalisation has its origin in common non-linguistic pro- 
cesses (cf. also Bybee 2010: 6-8) and has predictable consequences, such as the 
gradual and sometimes only partial elimination of the morphology associated 
with the source. Importantly, a hallmark of grammaticalisation is syntagmatic 
"freezing" (Croft 2000: 159; cf. also Lehmann 2015[1982]: 168), so that the order 
of the elements in the new construction mirrors the order of elements in the 
source. The result is a "correlation" between the syntagmatic structure of the old 
and the new construction, but one that effectively rests on inertia rather than 
overarching processing principles that work towards a correlation. 

From a methodological perspective, processing and learning accounts are an 
example of a broader trend of the “ad hoc search for functions that match the 
universals to be explained”, as Kirby (1999: 13) puts it. Attempts in the evolu- 
tionary literature to simulate processing or learning with computers in order to 
derive Greenberg's word order universals (e.g. Van Everbroeck 1999; Kirby & 
Christiansen 2003) have a particularly “just-so” flavour: All that computer simu- 
lations can do is show that processing or learning preferences of individuals can 
cause these correlations to emerge over time, all other historical factors being 
equal, not that they are actually responsible. What we would thus need is inde- 
pendent historical evidence that processing concerns do, in fact, guide historical 
change. There are some attempts to show this, for example, in earlier English (e.g. 
Fischer 1992; Clark et al. 2008), when the language appeared to converge on the 
word-order correlations after a period of freer word order. This could indeed be 
evidence for word-order correlations emerging at least in part out of processing 
considerations; but there are other possibilities in this case which need to be in- 
vestigated further, such as effects of the rise of analytic verb forms and 
periphrastic do, of the loss of inflections, or of contact with French (cf. 
also Fischer & van der Wurff 2006: 187-188 for some of the controversies). The 
historical role of processing is unclear even in this case, and there is no conclu- 
sive cross-linguistic evidence for it either. 

One possibility for establishing such causal relations cross-linguistically would 
be to look for cases of correlated evolution, i.e. situations in which a change in 
one word order can be shown to be followed by a change in another word order 
in the history of a language, or in its descendants. For example, if a language 
has verb-object order and prepositions but then changes to having object-verb 
order and postpositions, then this suggests that the two word orders are function- 
ally linked (if this event takes place after any grammaticalisation linking these 
verbs and postpositions). The only solid statistical test of this so far has been 
a widely discussed study by Dunn et al. (2011). Dunn and colleagues examined 
the ways in which four language families have developed (Bantu, Austronesian, 
Indo-European and Uto-Aztecan) and tested models of word order change using 
a Bayesian phylogenetic method for analysing correlated evolution. They found 
that some word orders do indeed change together: For example, the order of verb 
and object seems to change simultaneously with the order of adposition and noun 
in Indo-European. A model in which these two word orders are dependent is pre- 
ferred over a model in which they are independent with a Bayes factor of above 
5, a conventional threshold for significance. This seems to vindicate the idea that 
adpositions and verb-object order are functionally linked in Indo-European, and 
the pattern also holds up in Austronesian. It does not show up in the smaller 
and younger families Uto-Aztecan and Bantu, although that may be because of 
the low statistical power of this test when applied to small language families 
(cf. Croft et al. 2011). But a more important drawback is that there is no control 
for language contact. What could be happening is that some Indo-European lan- 
guages in India have different word orders because of the languages that they 
are near, such as Dravidian languages, which also have object-verb order and 
postpositions. A similar point could be made about the Austronesian languages 
that undergo word order change, which are found in a single group of Western 
Oceanic languages on the coast of New Guinea, which is otherwise dominated 
by languages with object-verb order and postpositions. 
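
The model comparison described above can be illustrated with a minimal Python sketch. It assumes that a log marginal likelihood is available for each of the two models and that the Bayes factor is reported on the scale of twice the difference between them. This scale is an assumption for illustration, as are the function names and all numerical values, which are hypothetical and not taken from Dunn et al. (2011).

```python
def log_bayes_factor(log_ml_dependent, log_ml_independent):
    """Twice the difference in log marginal likelihoods (assumed reporting scale)."""
    return 2.0 * (log_ml_dependent - log_ml_independent)

def prefers_dependent(log_ml_dependent, log_ml_independent, threshold=5.0):
    """True if the dependent model is preferred by more than the threshold."""
    return log_bayes_factor(log_ml_dependent, log_ml_independent) > threshold

# Hypothetical log marginal likelihoods for one pair of word-order traits
log_ml_dep = -100.0    # the two word orders evolve dependently
log_ml_indep = -104.0  # the two word orders evolve independently

print(log_bayes_factor(log_ml_dep, log_ml_indep))   # 8.0
print(prefers_dependent(log_ml_dep, log_ml_indep))  # True
```

A comparison of this kind only ranks the two models against each other. It is silent on why the two word orders change together, which is why the lack of a control for language contact matters.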

In the context of the present discussion, an important result of Dunn et al.'s 
(2011) paper is that word orders are very stable, staying the same over tens of 
thousands of years of evolutionary time (i.e. the total amount of time over mul- 
tiple branches of the families). In this light, it is also instructive to note that 
some typologically “mixed” or non-correlating languages show the same inert 
behaviour: Despite the fact that grammaticalisation has produced a mixture of 
prepositions and postpositions (e.g. in Chinese or Dagaare), the resulting systems 
have also survived for many generations, or even thousands of years, without 
showing any inclination to change. This, too, is a problem for processing-based 
theories, which sometimes explicitly predict that such inconsistencies should die 
out (e.g. Kirby & Hurford 1997). 

In the absence of convincing evidence for functional-adaptive motivations, I 
suggest that we accept that different types of syntactic constituents share their 
ordering patterns because they are historically related to each other, i.e. because 
they are linked by common ancestry. This also has important methodological 
consequences for typology. The kind of historical relatedness we observe here 
qualifies as a subtle, language-internal variant of Galton's problem (cf. Cysouw 
2011 for an introduction), and it is thus actually a confound in typological sam- 
ples. Just as other, more widely known, types of historical relatedness, such as 
a genealogical or areal interaction between two data points in a sample, need 
to be controlled for before one can test for a typological correlation, so does 
the language-internal historical relatedness between the grammatical patterns 
that make up that correlation. Put differently, languages in which possessor ar- 
guments are known to have developed from former object arguments and have 
simply adopted their order from this source, do not constitute an independent 
data point in support of the alleged word-order correlation. For typological prac- 
tice, this entails that we need large databases of attested grammaticalisation path- 
ways, and that we need to examine more carefully the actual markers and their 
(likely) etymologies before we set out to test a functional-adaptive hypothesis. In 
principle, it would then be possible to inspect whether certain grammaticalisa- 
tion pathways tend to be taken only in certain types of languages; for example, 
do postpositions only develop from nouns in a genitive construction (‘table’s
head’ > ‘table on’) if the language also places the verb after the object? It is easy
enough to find exceptions to that, such as Dagaare (Atlantic-Congo), which has 
taken this route to postpositions despite being a VO language (Bodomo 1997). But 
in a large database, we might still find interesting structural constraints, as well 
as geographical patterns, that could potentially speak for or against functional- 
adaptive motivations in addition to grammaticalisation. 

For now, the major point is that the historical non-independence of data points 
can create correlations that are not causal. Such spurious correlations are well- 
known from non-linguistic research (cf., e.g., the spurious correlation between 
chocolate consumption and Nobel Prize winners; cf. also Roberts & Winters 2013
for further discussion), and my claim in this paper is that this is a serious method- 
ological pitfall in the domain of word-order correlations. Given the naturalness 
of grammaticalisation, and the above observation that word orders tend to be pre- 
served and long retained after grammaticalisation, invoking functional-adaptive 
motivations to explain the correlations in question is not only redundant, but 
actually wrong-headed. It is as if one wanted to claim that there was a deeper 
ecological reason why chimpanzees and humans share 98.8% or so of their DNA, 
rather than just the primary historical reason, which is that they have a common 
ancestor. 
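
How inheritance alone can yield a correlation can be illustrated with a minimal simulation sketch. It assumes, purely for illustration, that each language has VO or OV order at random, that every adposition derives from a verb, and that the source order is retained with a hypothetical probability `retention`. None of these figures is taken from the studies cited above, and the function name is invented for this example.

```python
import random


def harmonic_share(n_languages=10_000, retention=0.9, seed=0):
    """Share of simulated languages whose adposition order matches verb-object order.

    Hypothetical model: no processing pressure is built in. An adposition
    keeps the order of its verbal source with probability `retention`;
    otherwise its order is set at random.
    """
    rng = random.Random(seed)
    harmonic = 0
    for _ in range(n_languages):
        verb_first = rng.random() < 0.5             # VO vs. OV, chosen at random
        if rng.random() < retention:
            adposition_first = verb_first           # order inherited from the source
        else:
            adposition_first = rng.random() < 0.5   # order reset at random
        harmonic += adposition_first == verb_first
    return harmonic / n_languages


print(f"harmonic languages: {harmonic_share():.1%}")
```

Under these assumptions the expected share of "harmonic" languages is `retention + (1 - retention) / 2`, so a strong correlation appears although the model contains no functional-adaptive pressure at all.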

Having said this, it should be pointed out that I am neither arguing against 
functional-adaptive explanations in general, nor am I denying the relevance of 
processing to understanding word order patterns as such, including some combi- 
nations of word order that tend to be preferred over others. For example, the fact 
that VO languages strongly tend to have postnominal relative clauses is plau- 
sibly related to processing constraints (Hawkins 2004). Similarly, correlations 
between numeral-noun and adjective-noun ordering do not have a clear ex- 
planation in terms of grammaticalisation, but they do seem to be functionally 
linked and hence show interesting dependencies in experiments in artificial lan- 
guage learning (e.g. Culbertson et al. 2012; cf. also Dryer 2019 [this volume]). 
But with more and more diachronic evidence coming to light, historical links 
between many grammatical categories (VPs, auxiliaries, genitives, adpositions) 
can no longer be dismissed as marginal and as “lack[ing] generality” (Hawkins 
1983: 131). Our default assumption, then, should be that the core word-order cor- 
relations are first and foremost an accidental by-product of grammaticalisation. 

Haspelmath (2019 [this volume]) actually acknowledges this type of expla- 
nation, at least for the ordering patterns of adpositional phrases, and labels it 
a “mutational constraint”, that is, a situation in which historical sources and
grammaticalisation pathways directly determine the synchronic outcomes and hence
make functional-adaptive explanations superfluous. On the other hand, he rejects
“common pathways” as too weak to have explanatory power in typology.
But how common is “common”, and when do we begin to speak of a mutational 
constraint? It is perfectly possible that common pathways (such as those docu- 
mented in Heine & Kuteva 2002; 2007), while not exhausting the possible sources 
and routes, are still frequent enough to produce a principled synchronic result. 
Therefore, I disagree with Haspelmath (p. 15) that we need not be able to un- 
derstand the diachronic patterns behind a universal tendency if there is a good 
functional-adaptive motivation available for it. In the case of word-order
correlations, and possibly other domains of grammar, it is the other way around: We
first need to understand the diachronic links between different types of phrases 
and then control for them when we attempt to establish whether there are uni- 
versal correlations beyond historical dependencies at all. It may turn out that the 
real question is why it should ever be the case that the order of grammaticalised 
categories, such as adpositions, genitives or auxiliaries, does not correlate with
that of their source constructions. 


4 Conclusion 


Word-order correlations are often invoked as evidence for universals of language 
acquisition or language processing. In this paper, I have argued that, before we 
can do so, it is important to understand the historical background of these pat- 
terns, which standard interpretations do not take into account. Given the natural- 
ness and the non-teleological nature of grammaticalisation processes, it should 
be our default assumption that the order of grammaticalised categories retains 
the order of their respective source constructions. From this perspective, word- 
order correlations are far from mysterious and, in many cases, do not require 
functional-adaptive motivations (such as specific processing principles) or in- 
nate constraints (such as a head-ordering parameter). Instead, the correlations 
arise during the creation of new constructions by extending old constructions. 
The grammaticalisation processes involved are well-understood and ubiquitous 
(cf. Bybee 2015). And although we will never be able to have a full picture of
the possible routes that lead to adpositions, auxiliaries, genitives, etc., the ones 
we know of seem common enough to produce the correlations in question. At 
the very least, they constitute language-internal dependencies, in Galton's spirit, 
that need to be controlled for in any typological investigation of word-order cor- 
relations, in addition to areal dependencies that hold across languages. If they 
are not, one runs the risk of erroneously inferring causation from correlation, as 
the word-order correlations would appear so strong that they require a deeper 
explanation, when in fact they are largely dependencies built into the sample. 

Chapter 4


Grammaticalization accounts of word 
order correlations 


Matthew S. Dryer 
University at Buffalo 


This paper examines the role that grammaticalization plays in explaining word 
order correlations. It presents some data that only grammaticalization accounts 
for, but also argues that there are correlations that grammaticalization does not 
account for. The conclusion is that accounts entirely in terms of grammaticalization 
or accounts that make no reference to grammaticalization are both inadequate. 


1 Introduction 


There is extensive literature both on identifying word order correlations (Green- 
berg 1963; Hawkins 1983; Dryer 1992) and on possible explanations for these cor- 
relations. Proposed explanations can be grouped loosely into three types. First, 
it is proposed that some correlations exist because of some sort of similarity or 
shared property of the pairs that correlate. An example of this is the hypothe- 
sis that the order of object and verb correlates with the order of adposition and 
noun phrase because both involve a pair of head and dependent. A second type 
of explanation is in terms of sentence processing (Kuno 1974; Dryer 1992; 2009; 
Hawkins 1994; 2004; 2014), under which the types that do not conform to the 
correlations are less frequent because structures containing the two inconsistent 
types are more difficult to parse. This would be what Haspelmath (2019 [this vol- 
ume]) calls a functional-adaptive type of explanation. A third line of explanation 
is in terms of grammaticalization (Givón 1979; Heine & Reh 1984: 241-244; Bybee 
1988; Aristar 1991; DeLancey 1994; Collins 2012, Collins 2019 [this volume]). For 
example it is hypothesized that the reason (or a reason) why the order of adpo- 
sition and noun phrase correlates with the order of verb and object is that one 
grammaticalization source for adpositions is verbs and the order of verb and object
remains the same when the verb grammaticalizes as an adposition. This line
of explanation is thus crucially based on the diachronic sources of adpositions 
and hence a type of source-based explanation (Cristofaro 2019 [this volume]). 

Despite these competing hypotheses for explaining word order correlations, 
there is surprisingly little attempt by proponents of an explanation in terms of 
grammaticalization to argue against other approaches or by proponents of other 
approaches to argue against grammaticalization. In fact, proponents of other ap- 
proaches rarely even mention the possible role of grammaticalization. The goal of 
this paper is to argue that both explanations in terms of grammaticalization and 
explanations in terms of shared features or processing are needed in explaining 
word order correlations. I will focus on the pros and cons of grammaticalization 
explanations, largely ignoring the difference between accounts in terms of shared 
features and accounts in terms of sentence processing.¹

In §2, I discuss explanations for correlations involving order of adposition and
noun phrase and discuss evidence that only a grammaticalization approach can
account for. Namely, I examine SVO & GenN languages that have both prepositions
and postpositions and show that an approach involving grammaticalization not only
predicts languages with both prepositions and postpositions but also correctly
predicts the semantics associated with each type of adposition. In §3, I give
reasons why grammaticalization cannot account for all word order correlations,
concluding that grammaticalization and other factors conspire to account for some
correlations. And in §4, I discuss data involving word order properties of
definiteness markers where grammaticalization seems to make the wrong prediction.


2 A grammaticalization account of the correlations with 
the order of adposition and noun phrase 


In this section, I present evidence for a grammaticalization account of correlations
involving the order of adposition and noun phrase, evidence that only
grammaticalization can account for. In §2.1, I discuss evidence of adpositions
grammaticalizing from verbs. In §2.2, I discuss evidence of a second grammaticalization
source for adpositions, namely head nouns in genitive constructions. The next section,


¹In Dryer (1992), I argue for a processing account over an account in terms of heads and
dependents for the various correlations between pairs of elements and the order of verb and
object.
§2.3, is, in my view, the most important section of this paper. In that section, I
discuss SVO languages which employ GenN word order. Grammaticalization theory
predicts that in such languages, if adpositions arise from both grammaticalization
sources discussed in §2.1 and §2.2, the language will have both prepositions
and postpositions, the former arising from verbs, the latter from head nouns in
genitive constructions, with particular semantics associated with each. I present
evidence from a number of languages showing that this prediction is borne out.


2.1 Adpositions that grammaticalize from verbs 


Let me turn now to one of the best-known word order correlations, between the 
order of object and verb and the order of adposition and noun phrase, where 
VO languages tend to have prepositions while OV languages tend to have post- 
positions (Greenberg 1963; Dryer 1992). Evidence for this correlation is given in 
Tables 1 and 2. The data for VO languages is given in Table 1. The numbers in Ta- 
ble 1 denote numbers of genera containing languages of the given sort, divided 
into five large continental areas (Dryer 1989). The more frequent type in each 
area is enclosed in square brackets. 


Table 1: Order of adposition and noun phrase in VO languages 


            Africa   Eurasia   Oceania   N.America   S.America   TOTAL

VO & Po       10        6         3          5          15         39
VO & Pr      [37]     [25]      [48]       [27]         15        152


Table 1 shows that prepositions outnumber postpositions in VO languages by a 
wide margin in four of the five areas, with the fifth area (South America) having 
an equal number of genera containing languages with prepositions and those 
containing languages with postpositions. Overall, VO & Pr outnumbers VO & Po 
by 152 to 39 genera. 

The corresponding data for OV languages is given in Table 2. Table 2 shows 
a stronger preference for postpositions in OV languages than the preference for 
prepositions in VO languages shown in Table 1, in that Table 2 shows only 11 
genera containing OV & Pr languages while Table 1 shows 39 genera contain- 
ing VO & Po languages. I discuss an explanation for this difference in terms of 
grammaticalization in §3 below.

Table 2: Order of adposition and noun phrase in OV languages


            Africa   Eurasia   Oceania   N.America   S.America   TOTAL

OV & Po      [25]     [45]      [82]       [28]        [41]       221
OV & Pr        3        2         5          0           1         11
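
The asymmetry between the two tables can be made explicit with a short calculation. The sketch below copies the genera counts from Tables 1 and 2; the variable and function names are chosen only for this illustration.

```python
# Genera counts copied from Tables 1 and 2, in the order
# Africa, Eurasia, Oceania, N.America, S.America.
TABLES = {
    "VO & Po": [10, 6, 3, 5, 15],
    "VO & Pr": [37, 25, 48, 27, 15],
    "OV & Po": [25, 45, 82, 28, 41],
    "OV & Pr": [3, 2, 5, 0, 1],
}

# Row totals, which should reproduce the TOTAL column of the tables.
totals = {row: sum(counts) for row, counts in TABLES.items()}


def exception_share(minority, majority):
    """Proportion of genera showing the non-correlating (minority) type."""
    return totals[minority] / (totals[minority] + totals[majority])


vo_exceptions = exception_share("VO & Po", "VO & Pr")
ov_exceptions = exception_share("OV & Pr", "OV & Po")
print(f"VO genera with postpositions: {vo_exceptions:.1%}")
print(f"OV genera with prepositions:  {ov_exceptions:.1%}")
```

On these counts, roughly one in five VO genera contains postpositional languages, against roughly one in twenty OV genera containing prepositional languages.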


An explanation of this correlation in terms of grammaticalization appeals to 
the fact that verbs are a common source for adpositions, so that when a verb is 
grammaticalized as an adposition, the order with the verb followed by object is 
retained as preposition followed by noun phrase, while the order with the object 
followed by verb is retained as noun phrase plus postposition. 

Two examples of this process of grammaticalization in English are the prepo- 
sitions including and concerning, as in (1). 


(1) English 


a. Four men, including John, arrived. 


b. I will talk to you later concerning your thesis. 


Both of these prepositions retain the present participle form ending in -ing, coming
from the verbs include and concern.

Grammaticalization of adpositions from verbs is common in many other lan- 
guages and widely described in the literature. The examples in (2) to (7) illustrate 
apparent examples of grammaticalization of particular semantic types of adposi- 
tions from verbs with particular meanings. 


give → for


(2) Efik (Niger-Congo, Delta Cross: Nigeria; Givón 2001: 163)
nam utom emi ni mi.
do work this give me


‘Do this work for me.’


give → to (marking addressee)

(3) Yoruba (Niger-Congo, Defoid: Nigeria; Givón 2001: 163)
mo sọ fún o
I said give you


‘I said to you.’

go → to (marking goal of motion)

(4) Nupe (Niger-Congo, Ebira-Nupoid: Nigeria; Givón 1979: 221)
ü bicilo dzüká.
he ran go market


‘He ran to the market.’


follow → with (comitative)


(5) Mandarin Chinese (Sino-Tibetan, Sinitic: China; Li & Thompson 1981:
423)
tā bù gēn wǒ jiǎng-huà.
3SG NEG follow 1SG speak-speech
‘He doesn't talk with me.’


take → object case marker


(6) Yatye (Niger-Congo, Idomoid: Nigeria; Givón 2001: 163)
iywi awá utsi ikü.
boy took door shut
‘The boy shut the door.’


be at → at


(7) Mandarin Chinese (Sino-Tibetan, Sinitic: China; Yu Li, p.c.)
tā zài guō-li chǎo fàn.
3SG be.at pot-in fry rice


‘He is frying rice in the pot.’


The grammaticalization of adpositions from verbs provides a possible basis for 
an explanation of the correlation between the order of verb and object and the 
order of adposition and noun phrase. 


2.2 Adpositions that grammaticalize from head nouns in nominal 
possessive constructions 


Another common grammaticalization source for adpositions (and probably the 
more common source) is head nouns in genitive constructions. English has a 
number of prepositions that have arisen from head nouns in genitive construc- 
tions, including those in (8). 

(8) English
a. in the side of NP → inside NP
b. by the side of NP → beside NP
c. by the cause of NP → because of NP


Because these adpositions arose from head nouns in a genitive construction with 
NGen order, they ended up as prepositions rather than postpositions. The oppo- 
site situation arose in Amharic, where the examples in (9) illustrate two postpo- 
sitions arising from head nouns in a GenN construction. 


(9) Amharic (Afro-Asiatic, Semitic: Ethiopia; Givón 1971: 399) 


a. NP + bottom → NP + under
kä-bet tač alla.
at-house bottom is
‘He is under the house.’

b. NP + reason → NP + because of
bä-issu mikniyat näw.
at-he reason is


‘It is because of him.’


This type of grammaticalization of adpositions from head nouns in genitive 
constructions would explain the correlation between the order of noun and geni- 
tive and the order of adposition and noun phrase. The data in Tables 3 and 4 pro- 
vides evidence for this correlation. The data in Table 3 shows GenN languages 
being overwhelmingly postpositional, while the data in Table 4 shows NGen lan- 
guages being overwhelmingly prepositional. 


Table 3: Order of adposition and noun phrase in GenN languages 


              Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL

GenN & Po      [31]     [50]     [73]     [33]     [54]     241
GenN & Pr        2        7       13        3        4       29

Table 4: Order of adposition and noun phrase in NGen languages


              Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL

NGen & Po        7        0        6        0        1       14
NGen & Pr      [34]     [19]     [29]     [19]     [10]     111


2.3 An interesting prediction of grammaticalization accounts for 
adpositions 


Sections §2.1 and §2.2 illustrate two grammaticalization sources for adpositions,
one from verbs, the other from head nouns in genitive constructions. In lan- 
guages which are VO and NGen, both sources will lead to prepositions rather 
than postpositions. Conversely, in languages which are OV and GenN, both sour- 
ces will lead to postpositions. But there are many languages which are VO but 
GenN. Dryer (1997; 2013) shows that although OV languages tend to be GenN 
and verb-initial languages tend to be NGen, both orders of noun and genitive are 
common among SVO languages, as shown in Table 5. 


Table 5: Order of genitive and noun in SVO languages 


               Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL

SVO & GenN       11       11       13        2      [11]      48
SVO & NGen      [41]     [16]      13       [5]       3       78


Table 5 shows that NGen is more common overall than GenN among SVO lan- 
guages by 78 genera to 48. However, the higher number of genera containing 
SVO & NGen languages turns out to be entirely due to languages in Africa. Out- 
side Africa, SVO & NGen and SVO & GenN are both found in exactly 37 genera. 
The general conclusion is that there is no evidence of any preference for NGen 
order over GenN order among SVO languages. 
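
The arithmetic behind this conclusion can be checked with a short sketch. It takes the African figure for SVO & GenN to be 11, the value implied by the row total of 48 and by the statement that both types occur in 37 genera outside Africa; the names used are illustrative only.

```python
# Genera counts for SVO languages, in the order
# Africa, Eurasia, Oceania, N.America, S.America.
# The African SVO & GenN value (11) is an assumption derived from the
# stated row total of 48 and the stated 37 genera outside Africa.
SVO_GENN = [11, 11, 13, 2, 11]
SVO_NGEN = [41, 16, 13, 5, 3]


def outside_africa(counts):
    """Sum the counts for every area except Africa (the first entry)."""
    return sum(counts[1:])


print("all areas:     ", sum(SVO_GENN), "GenN vs.", sum(SVO_NGEN), "NGen")
print("outside Africa:", outside_africa(SVO_GENN), "GenN vs.",
      outside_africa(SVO_NGEN), "NGen")
```

Removing the African column leaves the two types exactly balanced, which is why the overall majority for NGen cannot be read as a general preference among SVO languages.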

Because of the two grammaticalization sources for adpositions described in 
the two preceding sections, grammaticalization theory makes an interesting pre- 
diction about SVO & GenN languages. Namely, if adpositions arise in any such 
languages from both grammaticalization sources, the language should have both 
prepositions and postpositions, those arising from verbs being prepositions and 
those arising from nouns being postpositions. The evidence in this section shows 
that this prediction is borne out. 
In fact, grammaticalization theory makes more specific predictions about what
meanings will be associated with prepositions and what meanings will be as- 
sociated with postpositions in such languages. The lefthand column in Table 6 
summarizes typical meanings associated with adpositions that arise from verbs, 
while the righthand column summarizes typical meanings associated with adpo- 
sitions that arise from head nouns in genitive constructions. 


Table 6: Typical meanings associated with adpositions 


Typical meanings associated with        Typical meanings associated with
adpositions that come from verbs        adpositions that come from nouns

benefactive (‘for’)                     specific locations like
instrumental (‘with’)                   ‘under’, ‘behind’, ‘in front of’
comitative (‘with’)                     ‘because of’
similative (‘like’)
allative (‘to, toward’)
general locations (‘at’)
adpositions marking direct objects


Note that it is typically adpositions denoting specific locations that arise from 
nouns; adpositions denoting general locations (meaning something like ‘at’) of- 
ten arise from verbs. Similarly adpositions associated with motion away from 
source or towards a location also more often arise from verbs. Grammaticaliza- 
tion theory predicts that in an SVO & GenN language with both prepositions 
and postpositions, the prepositions will tend to have meanings like those in the 
lefthand column in Table 6, while the postpositions will tend to have meanings 
like those in the righthand column. This section shows that these predictions are 
also borne out. 

The first language illustrating how these predictions are borne out is Nluuki. 
The SVO order of Nluuki is illustrated in (10), GenN order in (11). 


(10) Nluuki (Tuu: South Africa; Collins & Namaseb 2011: 10) 
tharuxuke Ai Ooe. 
Haruxu DECL eat meat 


‘Haruxu is eating meat.’

(11) Nluuki (Tuu: South Africa; Collins & Namaseb 2011: 37)
siso tona
Siso knife


‘Siso's knife’

Nluuki has both prepositions and postpositions. Examples illustrating the preposition
rla are given in (12) and (13), (12) illustrating an instrumental use, (13) a
comitative use.


(12) Nluuki (Tuu: South Africa; Collins & Namaseb 2011: 25) 
n-a si laa Ooe gla ytona. 
1SG-DECL IRR cut meat with knife 


‘I will cut the meat with a knife.’


(13) Nluuki (Tuu: South Africa; Collins & Namaseb 2011: 25) 
lalate ke siisen gla plaggusi. 
lalafe DECL work with Nlangusi 


‘lala‘e works with Nlaggusi.’

In contrast, example (14) illustrates a postposition xuu ‘in front of’.


(14) Nluuki (Tuu: South Africa; Collins & Namaseb 2011: 80) 
besi I?aa süi lopa xuu 
necklace go sit.down child front 


"Ihe necklace fell in front of the child. 


The prepositions and postpositions in Nluuki (Collins & Namaseb 2011: 24-25) 
are listed in Table 7. 


Table 7: Prepositions and postpositions in Nluuki 


Prepositions                          Postpositions

pla ‘instrumental, comitative’        la?è ‘in’
la ‘like’                             xuu ‘in front of’
yn ‘linker’                           tszii ‘behind’
                                      Ighaa ‘next to’


Two of the three prepositions, nla ‘instrumental, comitative’ and lla ‘like’, conform
to the semantic types of adpositions arising from verbs, and the fact that they
are prepositions rather than postpositions can be explained if they have arisen
from verbs in a VO language. And all four of the postpositions represent specific 
locations, conforming to what we expect semantically of adpositions arising from 
head nouns in genitive constructions; the fact that they are postpositions rather 
than prepositions can be explained in that they have arisen from head nouns in 
a genitive construction in a GenN language. 
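
The logic of this check can be sketched as a small script. The meaning types are taken from Table 6; the coding of each Nluuki adposition by meaning type (for example, counting ‘in’ as a specific location, as the text does) and all names in the code are assumptions made for illustration.

```python
# Meaning types that Table 6 associates with verb-derived adpositions.
VERB_SOURCE = {
    "benefactive", "instrumental", "comitative", "similative",
    "allative", "general location", "object marker",
}
# Meaning types that Table 6 associates with noun-derived adpositions.
NOUN_SOURCE = {"specific location", "cause"}

# Table 7 coded by meaning type only (the coding is an assumption).
NLUUKI = {
    "prepositions": [
        {"instrumental", "comitative"},  # 'instrumental, comitative'
        {"similative"},                  # 'like'
        {"linker"},                      # 'linker'
    ],
    "postpositions": [
        {"specific location"},  # 'in'
        {"specific location"},  # 'in front of'
        {"specific location"},  # 'behind'
        {"specific location"},  # 'next to'
    ],
}


def conforming(adpositions, expected):
    """Count adpositions whose meaning types all fall within the expected set."""
    return sum(1 for meanings in adpositions if meanings <= expected)


# In an SVO & GenN language, prepositions are expected to pattern with
# verb sources and postpositions with noun sources.
prep_ok = conforming(NLUUKI["prepositions"], VERB_SOURCE)
post_ok = conforming(NLUUKI["postpositions"], NOUN_SOURCE)
print(f"prepositions conforming:  {prep_ok} of {len(NLUUKI['prepositions'])}")
print(f"postpositions conforming: {post_ok} of {len(NLUUKI['postpositions'])}")
```

The same coding scheme could in principle be applied to the other inventories in this section, although assigning individual glosses to meaning types inevitably involves judgment calls.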

A second example is provided by Logba. Like Nluuki, Logba is SVO, as illus- 
trated in (15), and GenN, as in (16). 


(15) Logba (Niger-Congo, Kwa: Ghana; Dorvlo 2008: 105) 
Setoró-kpe i-gbedi-é. 
Setor SG-peel NC-cassava-DET


‘Setor peeled the cassava.’


(16) Logba (Niger-Congo, Kwa: Ghana; Dorvlo 2008: 71) 
Kodzo a-klo-a 
Kodzo NC-goat-DET


‘Kodzo’s goat’ 


Also like Nluuki, Logba has both prepositions and postpositions. The preposition 
kpe with instrumental or comitative meaning is illustrated in (17). 


(17) Logba (Niger-Congo, Kwa: Ghana; Dorvlo 2008: 96) 
Udzi=é ó-gle uzugbo kpe a-futa. 
woman=DET SG-tie head with NC-cloth
‘The lady tied her head with a cloth.’


In contrast, an example illustrating a postposition etsi ‘under’ is given in (18). 


(18) Logba (Niger-Congo, Kwa: Ghana; Dorvlo 2008: 98) 
i-dato=a i-tsi a-füta-á etsi. 
NC-spoon=DET SG-be.in NC-cloth=DET under


‘The spoon is under the cloth.’


Table 8 lists the prepositions and postpositions of Logba (Dorvlo 2008: 95, 98).
While one of the prepositions has a meaning more commonly associated with
adpositions that arise from nouns (na ‘on’), the other prepositions all have
meanings that grammaticalization theory predicts for adpositions arising from
verbs, and all the postpositions have meanings involving specific locations, the
types of meanings that grammaticalization predicts for adpositions that arise
from nouns.


Table 8: Prepositions and postpositions of Logba


Prepositions                          Postpositions

fe ‘at’                               nu ‘inside’
na ‘on’                               etsi ‘under’
kpe ‘instrumental, comitative’        tsü ‘on’
gu ‘about’                            ité ‘in front of’
dzigu ‘from’                          zugbö ‘on’
                                      yó ‘surface contact’ (e.g. on a wall)
                                      anü ‘at tip of, at edge of’
                                      otsoe ‘on the side of’
                                      ama ‘behind’

The third SVO & GenN language with both prepositions and postpositions 
is Eastern Kayah Li, a Karenic language in the Sino-Tibetan family spoken in 
Myanmar and Thailand. The prepositions and postpositions of Eastern Kayah Li 
are listed in Table 9 (Solnit 1997: 209-214). Apart from three prepositions with 
unusual meanings (‘as much as’, ‘as big as’, ‘as long as’), the rest of the prepo- 
sitions and all of the postpositions have meanings conforming to the semantics 
typically associated with adpositions arising from verbs and adpositions arising 
from head nouns in genitive constructions respectively. 

The fourth SVO & GenN language exhibiting a similar pattern is Jabem, an 
Oceanic language in the Austronesian family spoken in Papua New Guinea. In 
Table 10 is a list of the prepositions and postpositions of Jabem (Dempwolff 1939; 
Bradshaw & Czobor 2005: 42-44; Ross 2002: 291). While all the postpositions 
again have meanings denoting specific locations, as we would expect of adposi- 
tions arising from head nouns in genitive constructions, three of the prepositions 
also have meanings of that sort (‘next to’, ‘close to’). In fact, Dempwolff specifi- 
cally suggests that these prepositions arose from verbs (suggesting, for example, 
that tamin ‘next to’ comes from a verb meaning ‘to be close upon’).²


²I base this on Bradshaw & Czobor's (2005) English translation of Dempwolff (1939).

Table 9: Prepositions and postpositions of Eastern Kayah Li


Prepositions                  Postpositions

dy ‘at’                       kū ‘inside’
mú ‘at’                       klo ‘outside’
by ‘at’                       khu ‘on, above’
bä ‘as much as’               ke ~ kedé ‘down inside’
ti ‘as big as’                kha ‘at apex of’
ty ~ thy ‘as long as’         lé ‘under, downhill from’
phi ~ hi ‘like’               cha ‘near’
                              né ~ bésené ‘in front of’
                              khjä ~ békhjà ‘behind’
                              lo ‘on non-horizontal surface’
                              kla ‘in (an area)’
                              rokle ‘beside’
                              ple ~ ple ka ‘in narrow space between’
                              cokü ‘in middle of, between’
                              thu ‘on edge of’
                              tokjà ‘in the direction of’


Table 10: Prepositions and postpositions of Jabem


Prepositions                  Postpositions

tamir ‘next to, onto’         lélóm ‘inside’
ban ‘close to’                lólóc ‘on top of’
pay ‘close to’                labu ‘under’
ya ‘instrumental’             sawa ‘between’
alen ‘from’                   lun ‘in middle of’
                              nem ‘in front of’
                              mu ‘behind’
                              gala ‘near’
                              tali ‘at edge of’




Tables 11 to 16 list the prepositions and postpositions of six other SVO
& GenN languages that have both. All show patterns similar to those in the four
languages described above in this section, with the prepositions having meanings
associated with adpositions arising from verbs and the postpositions having
meanings associated with adpositions arising from nouns.


Table 11: Hoá (Kxa: Botswana, Collins & Gruber 2014: 101-105)


Prepositions                  Postpositions

ki ‘linker’                   na ‘in’
ke ‘comitative’               za ‘by, beside’
                              llg'am ‘above’
                              tka ‘below’
                              Phàá ‘in front of’
                              kya^m ‘near’


Table 12: Koromfe (Niger-Congo, Gur: Burkina Faso, Mali, Rennison 1997; 2017)


Prepositions                          Postpositions

la ‘instrumental, comitative’         ne ‘benefactive, purpose, about’
hal ‘until’                           kana ‘like’
                                      doba ‘on top of’
                                      heraga ‘beside, near’
                                      hogo ‘under’
                                      jikane ‘in front of’
                                      joro ‘in, inside’
                                      belle ‘behind’
                                      tulle ‘in the middle of, between’

Table 13: Mandarin Chinese (Sino-Tibetan, Sinitic: China, Li & Thompson 1981)


Prepositions (or coverbs)         Postpositions (or locative particles)

gēn ‘with (comitative)’           shàng ‘on top of, above’
gěi ‘for (benefactive)’           xià ‘below’
bǎ ‘object marker’                lǐ ‘in, inside’
duì ‘toward’                      wài ‘outside’
cóng ‘from’                       qián ‘in front of’
zài ‘at’                          hòu ‘behind’
tì ‘instead of’                   páng ‘beside’
bèi ‘by’                          dōngbù ‘east of’
àn ‘according to’                 zhèr ‘this side of’
dào ‘to’                          qián ‘in front of’
                                  hòu ‘behind’
                                  páng ‘beside’
                                  zhōngjiān ‘in the centre of’

Table 14: Koyra Chiini (Songhay: Mali, Heath 1999: 104-109)


Prepositions                          Postpositions

nda ‘comitative, instrumental’        se ‘dative’
bilaa ‘without’                       ra ‘locative’
hal ‘until’                           ga ‘beside, from’
jaa ‘since’                           doo ‘at the place of’
bara ‘except’                         banda ‘behind’
kala ‘except’                         beene ‘above’
                                      čire ‘under’
                                      kuna ‘in’
                                      jere ‘beside’
                                      jine ‘in front of’
                                      maasu ‘inside’
                                      tenje ‘facing’

Table 15: Taba (Austronesian, South Halmahera: Indonesia, Bowden
2001: 109-111)


Prepositions                          Postposition

ada ‘comitative, instrumental’        li ‘on, in, at’ᵃ
pake ‘instrumental’
untuk ‘benefactive’
lo ‘like’


ᵃThe fact that the one postposition in Taba has general locative meaning does not fit the
expectations for a postposition in a GenN language. But the fact that it is locative while the
prepositions are not does fit loosely. It is possible that it originally had a narrower locative
meaning that has become bleached.


Table 16: Dagbani (Niger-Congo, Gur: Ghana, Olawsky 1999)


Prepositions                          Postpositions

ni ‘comitative, instrumental’         nyaana ‘behind’
jendi ‘about, concerning’             zuyu ‘on top of’
                                      gbinni ‘under’
                                      sani ‘towards’
                                      sunsuuni ‘in the middle of’
                                      ni ‘in, at, to’
                                      puuni ‘inside’
                                      polo ‘in the direction of’
                                      lonni ‘under’

The languages illustrated in Tables 7 to 16 above are instances of SVO
languages with GenN order and both prepositions and postpositions. Though less 
common, there are also languages of the opposite sort, OV languages with NGen 
order and both prepositions and postpositions, where the semantics associated 
with prepositions and postpositions respectively is the opposite of that found 
in SVO & GenN languages. An example is Iraqw. Example (19) illustrates the 
preposition daandá ‘behind’. That it has grammaticalized from the head noun in 
a genitive construction is clear from the fact that it occurs in construct state, the 
morphological form that head nouns take in genitive constructions. 


(19) Iraqw (Afro-Asiatic, Cushitic: Tanzania; Mous 1993: 97)
     looa  i     daandü         hunkáy.
     sun   3SBJ  behind.CONSTR  cloud
     ‘The sun is behind the cloud.’


In contrast, example (20) illustrates a postpositional clitic =i ‘directional’ that
attaches to the last word in the noun phrase. In (20) it attaches to the noun do’
‘house’, the possessor of afkü ‘mouth’ (‘door’), but it is marking the entire noun
phrase afkü do’ ‘mouth (door) of the house’ as the goal of the motion denoted by
the verb qaas ‘put’.


(20) Iraqw (Afro-Asiatic, Cushitic: Tanzania; Mous 1993: 252)
     famfe'amo  u-n             af-kü              do'-i      qaas-áan.
     snake      MASC.OBJ-EXPEC  mouth-CONSTR.MASC  house-DIR  put-1PL
     ‘Let us put a snake on the door of the house.’


In Table 17 is a list of prepositions and postpositions in Iraqw (Mous 1993: 
95-107). Setting aside momentarily the first three prepositions in Table 17, the 
semantics associated with the prepositions and postpositions in Iraqw is the re- 
verse of what we found in (10) to (18) for SVO & GenN languages. Namely, in 
Table 17, it is the prepositions which denote specific locations, while the post- 
positions have meanings that are generally associated with adpositions arising 
from verbs. 

The first three prepositions in Table 17 have the same meanings as the first 
three postpositions in the table. Their meanings are thus ones that we might have 
expected to be associated with postpositions in an OV language. These preposi- 
tions take the form of /a/ plus the corresponding postpositional clitics. Mous 
(1993: 102) speculates that the /a/ in these forms may have originally been the 
copula a. It is possible that these prepositions have arisen by analogy to other 
prepositions in the language. 
Table 17: Prepositions and postpositions in Iraqw

Prepositions                    Postpositions
ar         ‘instrumental’       =(a)r  ‘instrumental, comitative’
as         ‘because of’         =sa    ‘because of’
ay         ‘to’                 =i     ‘to’
dir        ‘to’                 =wa    ‘from’
amor       ‘at’
daandü     ‘on’
ala        ‘behind’
guruu      ‘inside’
gamu       ‘under’
bihháa     ‘beside’
tla'a(ng)  ‘between’
tsee'ä     ‘outside’
afíqoomár  ‘until’
gawa       ‘on’
geerá      ‘before’
afa        ‘at the edge of’
bará       ‘in’



A second instance of an OV & NGen language with both prepositions and postpositions
is Kanuri. Example (21) illustrates the locative-instrumental postpositional
clitic =lan attaching to a postnominal modifier Musa=be ‘Musa’s’, marking
far Musa=be ‘Musa’s horse’ as an instrumental.


(21) Kanuri (Saharan: Nigeria, Niger; Hutchison 1976: 5)
     [far    Musa=be]=lan    kadio.
     [horse  Musa=GEN]=INS   come.PST.3SG
     ‘He came on/by Musa’s horse.’


Kanuri also has prepositions, like suro 'inside' in (22). 


There are thus two postpositional clitics in the phonological word Musa=be=lan in (21),
the =be marking Musa as possessor of far ‘horse’ and the =lan marking far Musa=be ‘Musa’s
horse’ as an instrumental.
(22) Kanuri (Saharan: Nigeria, Niger; Hutchison 1976: 80)
     suro    fato-be-ro    kargawo.
     inside  house-GEN-to  enter.PST.3SG
     ‘He went into the house.’


Note that suro retains its nominal nature in (22), in that its complement fato 
‘house’ is marked as a possessor, with the genitive postpositional clitic =be, and 
the entire phrase marked with the postpositional clitic =ro ‘to’, so that (22) could 
be glossed as 'He went to the inside of the house'. To what extent these locational 
nouns have grammaticalized as prepositions is not clear. Even if they have not 
grammaticalized much yet, they illustrate how an OV & NGen language could 
acquire prepositions. 

In Table 18 is a list of prepositions and postpositions of Kanuri (Hutchison 1981: 
257-263). 


Table 18: Prepositions and postpositions of Kanuri

Prepositions                        Postpositions
bótówó         ‘next to’            =làn   ‘locative, instrumental’
ci             ‘at edge of’         =rò    ‘benefactive, indirect object, to’
daryé          ‘at the end of’      =mben  ‘through, towards’
dawt           ‘in middle of’
fawt           ‘in front of’
farta          ‘at base of’
goré           ‘next to’
käte           ‘between’
kalä           ‘on top of’
ngawö          ‘behind, after’
sadia ~ cídíà  ‘under’
süró           ‘inside, during’


The meanings associated with the prepositions in Kanuri are similar to those of 
the prepositions in Iraqw, but are also similar to the meanings of the postposi- 
tions in the various SVO & GenN languages discussed above. Conversely, the 
meanings associated with the postpositions in Kanuri are similar to those of the 
postpositions in Iraqw and also similar to the meanings of the prepositions in 
the various SVO & GenN languages discussed above. 
There is another instance of a language with both prepositions and postpositions
that provides an interesting variation of the argument in this section,
namely English. While English is predominantly a prepositional language, it has 
at least two postpositions, ago and notwithstanding, as in (23). 


(23) English 
a. I saw him three weeks ago. 


b. I went to the concert, the doctor's advice notwithstanding. 


What is unusual about these two postpositions in English is that although both 
are apparently grammaticalizations of verbs, they are ones where what is now 
the object of that postposition was originally the subject of the verb (rather than 
the object, the more common situation with grammaticalizations from verbs). 
According to the Merriam Webster online dictionary, ago comes from an obsolete
verb meaning ‘pass’ so that three weeks ago derives from three weeks have
passed, where three weeks was originally the subject of this verb. And notwithstanding
comes from not plus a form of the verb meaning ‘withstand’ in the sense
of ‘providing an obstacle for’; again, what is now the object of the postposition
notwithstanding was originally the subject of the verbal expression. The fact that 
these two words arose as postpositions rather than as prepositions reflects the 
fact that subjects normally preceded the verb, even in earlier varieties of English 
when word order was more flexible. Again, only a grammaticalization account 
explains these. 

The evidence in this section involves data that only grammaticalization can 
explain. An explanation in terms of grammaticalization for the correlation be- 
tween the order of verb and object and order of adposition and noun phrase 
as well as the correlation between the order of noun and genitive and order of 
adposition and noun phrase predicts that we should find both prepositions and 
postpositions in the same language where the former derive from verbs and the 
latter from head nouns in genitive constructions, as well as predicting the seman- 
tic differences between the two types of adposition. The evidence in this section 
shows how these predictions are borne out. There is no obvious way in which 
accounts in terms of processing or similarity could explain this data. 


Notwithstanding also occurs as a preposition. The postpositional use is apparently the original
use. I suspect that the use as a preposition arose due to its semantic similarity to another
preposition, despite.

https://www.merriam-webster.com/dictionary
3 What grammaticalization does not explain


The preceding section provides evidence that grammaticalization explains, at 
least partly, the correlation between the order of verb and object and order of 
adposition and noun phrase as well as the correlation between the order of noun 
and genitive and order of adposition and noun phrase. In this section, I discuss 
the question whether grammaticalization fully explains word order correlations 
and argue that it does not. I first discuss word order correlations for which there 
does not seem to be any good explanation in terms of grammaticalization. Ta- 
ble 19 provides a list of pairs of elements that are shown by Dryer (1992) to corre- 
late with the order of verb and object, where the verb patterner refers to elements 
that occur first in these pairs more often among VO languages than among OV 
languages (and where the object patterner refers to the other member of the pair). 


Table 19: Pairs of elements that correlate with the order of verb and object

Verb patterner          Object patterner         Example
verb                    adpositional phrase      slept + on the floor
verb                    manner adverb            ran + slowly
copula verb             predicate                is + a teacher
‘want’                  VP                       wants + to see Mary
noun                    relative clause          movies + that we saw
adjective               standard of comparison   taller + than Bob
complementizer          clause                   that + John is sick
question particle       sentence
adverbial subordinator  clause                   because + Bob has left


For none of these pairs of elements that correlate with the order of verb and 
object is there a convincing explanation in terms of grammaticalization. For ex- 
ample, the order of verb and adpositional phrase most likely correlates with the 
order of verb and object because of semantic similarities between these two pairs 
of elements or because of processing factors. It is hard to imagine an explanation 
in terms of grammaticalization for this correlation. 

I devote the remainder of this section to discussing the correlation between 
the order of verb and object and the order of noun and genitive. While there 
have been attempts to explain this correlation in terms of grammaticalization, 
I claim here that such attempts fall short of providing a plausible explanation. 
A good summary of this approach is provided by Collins (2019 [this volume]).
However, most of the cases discussed by Collins are highly speculative, espe- 
cially compared to the evidence for adpositions deriving from verbs or nouns. 
The arguments involve cases where the constructions now used for main clauses 
are claimed to have originated from nominalizations (where a construction like 
John's seeing Peter is claimed to have replaced an existing finite construction like 
John saw Peter). Assuming that the word order in nominalizations reflects the 
order of noun and genitive (an assumption that is probably valid), the new con- 
struction will employ an order of verb and object that reflects the order of noun 
and genitive. 

While there probably have been some instances in which a nominalization 
construction came to be used as the primary construction for main clauses, there 
is little evidence of this in most families and the correlation between the order 
of verb and object and the order of noun and genitive seems far too strong to 
be explained purely in this way. Consider the data in Table 20 on the relative 
frequency of the different orders of noun and genitive in OV languages. 

Table 20 shows that GenN order outnumbers NGen by 247 to 25 genera, a ratio 
of almost 10-to-1. The evidence for nominalizations coming to be used as main 
clauses is far too meagre to account for such a strong correlation. 
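
The ratio cited here can be checked directly against the genus counts reported in Table 20. The following Python sketch is only an illustration of that arithmetic; the variable names are assumptions introduced for the example and are not part of the original study.

```python
# Genus counts for OV languages as reported in Table 20, by area.
# Variable names are illustrative assumptions, not from the original study.
areas = ["Africa", "Eurasia", "Oceania", "N.America", "S.America"]
ov_gen_n = [26, 46, 87, 34, 54]   # OV & GenN
ov_n_gen = [13, 1, 10, 0, 1]      # OV & NGen

total_gen_n = sum(ov_gen_n)
total_n_gen = sum(ov_n_gen)
ratio = total_gen_n / total_n_gen

print(f"GenN {total_gen_n} vs. NGen {total_n_gen}: about {ratio:.1f} to 1")
```

The per-area lists are kept so that the same totals can be recomputed for any subset of areas.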

It should be noted that the order of noun and genitive correlates with the 
order of verb and object less strongly than the order of adposition and noun 
phrase correlates with either the order of verb and object or the order of noun 


Some of Collins' arguments are particularly unconvincing. He cites data from Angas showing
nominalizations being used for complements of the verb meaning ‘want’. But this only
shows that some languages express such complements using nominalizations; it provides no 
evidence of nominalizations coming to be used as main clauses. He also cites the large number 
of Austronesian languages as evidence for the frequency by which nominalizations become 
main clauses. But quite apart from the fact that Collins provides no evidence to support his 
claim that it is generally accepted that nominalizations came to be used as main clauses in 
Austronesian, the size of the family is not relevant; what is relevant is the number of instances 
of changes of this sort. A number of proposals that main clause constructions originated as 
nominalizations are based largely on the fact that the same case marker is used for both pos- 
sessors and subjects (or transitive subjects). But there are many ways by which this can arise 
without nominalizations coming to be used as main clauses.

It will also determine the order of verb and subject, especially for intransitive verbs. There
are issues arising here that are beyond the scope of this paper. And while I find the evidence 
that grammaticalization explains the correlation between the order of verb and object and the 
order of noun and genitive unconvincing, I must concede that it would account for the large 
number of SVO & GenN languages. In other words, it would account for the fact that the order 
of noun and genitive is one of the few orders that correlates not only with the order of verb 
and object but also with the order of verb and subject (Dryer 2013). 
Table 20: Order of noun and genitive in OV languages

             Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL
OV & GenN     [26]    [46]     [87]      [34]     [54]     247
OV & NGen      13       1       10         0        1       25


and genitive: Tables 1 and 2 above show a particularly strong correlation between 
the order of verb and object and the order of adposition and noun phrase; Tables 3 
and 4 show an even stronger correlation between the order of noun and genitive 
and the order of adposition and noun phrase. But the large number of SVO & 
GenN languages shows that the correlation between the order of verb and object 
and the order of noun and genitive is less strong. 

One possible explanation for why the correlation between the order of verb 
and object and the order of noun and genitive is weaker is that all three of these 
correlations are due in part to factors other than grammaticalization (such as 
the processing explanations of Dryer 1992 and Hawkins 1994; 2004; 2014), but 
that grammaticalization augments the correlation between the order of verb and 
object and the order of adposition and noun phrase as well as the correlation 
between the order of noun and genitive and the order of adposition and noun 
phrase. In other words, it may be a mistake to try to choose between grammat- 
icalization and other factors in explaining word order correlations; they may 
conspire to lead to these stronger correlations. 

In fact, data presented by Dryer (1992; 2013) suggests that the correlation be- 
tween the order of verb and object and the order of adposition and noun phrase 
as well as the correlation between the order of noun and genitive and the order of
adposition and noun phrase are stronger than most of the correlations in Table 19
above. Since there do not appear to be promising explanations for those correlations
in terms of grammaticalization, the fact that the two correlations involving
adpositions are particularly strong suggests again that both grammaticalization 
and other factors play a role in explaining those correlations. 

Note also that grammaticalization explains the fact mentioned above in §2.1
that the preference for postpositions among OV languages is stronger than the 
preference for prepositions among VO languages. Namely, OV languages are 
overwhelmingly GenN so that both sources for adpositions lead to postpositions 
in OV languages. In contrast there are many SVO languages with GenN order. In 
such languages the adpositions derived from head nouns will be postpositions, 
so that (assuming some such languages lack adpositions derived from verbs) we 
expect to find SVO languages with postpositions. 
4 Order of noun and definiteness marker


In this section, I discuss a different type of problem for grammaticalization accounts
of word order correlations. In the cases discussed in §3, grammaticalization
simply fails to predict a word order correlation that can be shown to be
real. In the case discussed in this section, grammaticalization makes a prediction 
that turns out not to hold, involving the order of definiteness marker and noun. 

The most common grammaticalization source for markers of definiteness ap- 
pears to be demonstratives. In fact my database contains 102 instances of lan- 
guages that use demonstratives as markers of definiteness, compared to 274 lan- 
guages with markers of definiteness that are distinct from demonstratives. Both 
the order of definiteness marker and noun and the order of demonstrative and 
noun exhibit weak correlations with the order of verb and object, but what is 
surprising from the perspective of grammaticalization is that they exhibit oppo- 
site correlations. Namely, definiteness markers precede the noun more often in 
VO languages than in OV languages, while demonstratives follow the noun more 
often in VO languages than in OV languages. 

Consider first definiteness markers in VO languages. Table 21 provides data
on the order of definiteness marker and noun in VO languages. The last line in
Table 21 gives the number on the first line as a proportion of the sum of the
numbers on the first and second lines. For example, the .21 on the third line
in Table 21 under Africa represents 8 as a proportion of 39 (the sum of 8 and
31). I use these proportions in the discussion below.

Table 21: Order of noun and definiteness marker in VO languages

                  Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL
VO & DefN            8     [11]     [16]      [17]     [7]      59
VO & NDef          [31]      3       13         8       0       55
Proportion DefN     .21     .79      .55       .68    1.00    mean=.64
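
The proportions in the last line of Table 21, and their mean over the five areas, can be reproduced with a few lines of Python. This is a minimal sketch of the arithmetic described above; the function and variable names are assumptions made for the illustration.

```python
# Genus counts for VO languages from Table 21, by area.
# Names are illustrative assumptions, not from the original study.
areas = ["Africa", "Eurasia", "Oceania", "N.America", "S.America"]
vo_def_n = [8, 11, 16, 17, 7]    # VO & DefN
vo_n_def = [31, 3, 13, 8, 0]     # VO & NDef

def proportions(first, second):
    """Return first / (first + second) for each area."""
    return [a / (a + b) for a, b in zip(first, second)]

props = proportions(vo_def_n, vo_n_def)
# Mean of the five area proportions: every area counts equally.
mean_of_props = sum(props) / len(props)
# Pooled proportion: every genus counts equally, so large areas dominate.
pooled = sum(vo_def_n) / (sum(vo_def_n) + sum(vo_n_def))

for area, p in zip(areas, props):
    print(f"{area:10s} {p:.2f}")
print(f"mean of proportions {mean_of_props:.2f}, pooled proportion {pooled:.2f}")
```

The mean of the area proportions (.64) is higher than the pooled proportion of genera (about .52) because the pooled figure is pulled down by the many NDef genera in Africa.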


Table 21 shows the two orders of definiteness marker and noun to be about 
equally common among VO languages, with DefN order found in languages in 
59 genera and NDef order found in languages in 55 genera. This is a case, how- 
ever, where the total numbers of genera are somewhat misleading, since one area, 
Africa, exhibits a very different pattern from what we find in the other four ar- 
eas. In Africa, genera containing VO languages in which the definiteness marker 
follows the noun outnumber genera containing VO languages in which the definiteness
marker precedes the noun by 31 to 8. In the other four areas, in contrast,
it is more common among VO languages for the definiteness marker to precede 
the noun; in fact, in three of the areas (Eurasia, North America, and South Amer- 
ica), DefN order is more than twice as common as NDef order. The mean of the 
proportions over the five areas, namely .64, also reflects a preference for DefN 
order among VO languages. Another way to see this is that if we exclude Africa, 
DefN outnumbers NDef among VO languages by 51 to 24.

Table 22 provides comparable data on the order of definiteness marker and 
noun among OV languages. We again find only a small difference, though it is 
NDef that outnumbers DefN among OV languages, by 53 genera to 38. 


Table 22: Order of noun and definiteness marker in OV languages

                  Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL
OV & DefN            3      [9]      15         4      [7]      38
OV & NDef          [12]      5      [23]       [9]      4       53
Proportion DefN     .20     .64      .39       .31     .64    mean=.44


But what is revealing is to compare the proportions from the last lines of Tables 
21 and 22, given in Table 23. 


Table 23: Proportion of genera containing DefN languages among VO
vs. OV languages

      Africa   Eurasia   Oceania   N.America   S.America   Mean
VO    [.21]    [.79]     [.55]     [.68]       [1.00]      .64
OV     .20      .64       .39       .31         .64        .44


Here we find that although the margin of difference in Africa is very small, it is 
still the case that the proportion of genera containing DefN languages is greater 
among VO languages in all five areas. This gives us reason to conclude that there 


The higher preference for NDef order among VO languages in Africa reflects a general difference
between Africa and the rest of the world in that postnominal modifiers are more
mon in Africa than elsewhere (Dryer 2010). Table 20 above shows a similar difference between 
Africa and the rest of the world: while GenN outnumbers NGen among OV languages overall 
by almost 10-to-1, the ratio in Africa is only 2-to-1 and over half (13 out of 25) of the genera 
containing OV & NGen languages are in Africa. 
is a correlation, albeit a weak one, between the order of verb and object and the
order of definiteness marker and noun, with the definiteness marker preceding 
the noun more often among VO languages than among OV languages. 

Given the fact that the most common grammaticalization source for definite- 
ness markers appears to be demonstratives, we might expect to find a similar 
correlation between the order of verb and object and the order of demonstrative 
and noun. We do find a clear trend, but it is the opposite correlation. Namely 
while definiteness markers precede the noun more often among VO languages 
compared to OV languages, demonstratives tend to follow the noun more often 
among VO languages compared to OV languages. 

Tables 24 to 26 provide data supporting this. Table 24 provides relevant data 
for VO languages. It shows that although NDem order is slightly more common 
than DemN order, by 118 genera to 92, this order is more common in only three 
of the five areas (and in fact, if we exclude Africa, it is DemN order that is more 
common among VO languages, by 84 genera to 66). 


Table 24: Order of noun and demonstrative in VO languages

                  Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL
VO & DemN            8      12       24       [24]     [24]     92
VO & NDem          [52]    [16]     [31]       12        7     118
Proportion DemN     .13     .43      .44       .67      .77   mean=.49


However, Table 25 shows that among OV languages, DemN order is about twice 
as common as NDem order, by 181 genera to 95, although there are two areas 
where NDem is more common among OV languages. 


Table 25: Order of noun and demonstrative in OV languages

                  Africa   Euras   Oceania   N.Amer   S.Amer   TOTAL
OV & DemN           16     [44]      45       [30]     [46]    181
OV & NDem          [18]      6      [57]        6        8      95
Proportion DemN     .47     .88      .44       .83      .85   mean=.70


Again, it is useful to compare the proportions from the last lines of Tables 24 
and 25, shown in Table 26. 
Table 26: Proportion of genera containing DemN languages among VO
vs. OV languages

      Africa   Eurasia   Oceania   N.America   S.America   Mean
VO     .13      .43       .44       .67         .77        .49
OV    [.47]    [.88]      .44      [.83]       [.85]       .70


Table 26 shows that the proportion of genera containing DemN languages is 
higher among OV languages in four areas while the proportion is the same in 
the fifth area (Oceania). There is thus a clear trend in the opposite direction
from what we found for the order of definiteness marker and noun. Given that 
the most common grammaticalization source for definiteness markers appears 
to be demonstratives, this contrast is quite surprising. 
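
The area-by-area comparison behind this contrast can be sketched as follows. The counts are those reported in Tables 21, 22, 24 and 25; the data structure and function names are assumptions introduced for the illustration, and proportions are rounded to two decimals, as in Tables 23 and 26, before being compared.

```python
# Genus counts by area (Africa, Eurasia, Oceania, N.America, S.America).
# Each entry pairs the prenominal counts with the postnominal counts.
# The layout and names are illustrative assumptions.
counts = {
    "DefN": {"VO": ([8, 11, 16, 17, 7], [31, 3, 13, 8, 0]),     # Table 21
             "OV": ([3, 9, 15, 4, 7], [12, 5, 23, 9, 4])},      # Table 22
    "DemN": {"VO": ([8, 12, 24, 24, 24], [52, 16, 31, 12, 7]),  # Table 24
             "OV": ([16, 44, 45, 30, 46], [18, 6, 57, 6, 8])},  # Table 25
}

def rounded_props(first, second):
    """Proportion first / (first + second) per area, to two decimals."""
    return [round(a / (a + b), 2) for a, b in zip(first, second)]

def compare(feature):
    """Count areas where VO is higher and areas where OV is higher."""
    vo = rounded_props(*counts[feature]["VO"])
    ov = rounded_props(*counts[feature]["OV"])
    vo_higher = sum(v > o for v, o in zip(vo, ov))
    ov_higher = sum(o > v for v, o in zip(vo, ov))
    return vo_higher, ov_higher

def_result = compare("DefN")
dem_result = compare("DemN")
print("DefN: VO higher in", def_result[0], "areas, OV higher in", def_result[1])
print("DemN: VO higher in", dem_result[0], "areas, OV higher in", dem_result[1])
```

On these counts, the prenominal proportion is higher among VO languages in all five areas for definiteness markers, while for demonstratives it is higher among OV languages in four areas and tied in the fifth at two decimal places.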

I have no explanation for the source of this difference between definiteness 
markers and demonstratives. But I will share some interesting data from partic- 
ular languages that conforms to this difference. First, there are a few languages 
in which the same form is used as a demonstrative and as a marker of definite- 
ness, but this form occurs on different sides of the noun, depending on its func- 
tion. In Swahili, the forms that are used as distal demonstratives when following 
the noun function as markers of definiteness when they precede the noun, as 
shown in (24). Since Swahili is SVO, this difference conforms to the contrast in 
the crosslinguistic data shown above. 


(24) Swahili (Niger-Congo, Bantoid; Ashton 1947: 59)

     a. m-tu     yu-le
        NC1-man  NC1-that
        ‘that man’

     b. yu-le    m-tu
        NC1-DEF  NC1-man
        ‘the man’

In Abui, we find the opposite situation: the form do functions as a demonstrative
when it precedes the noun, as in (25a), but as a marker of definiteness when
it follows the noun, as in (25b).


If we compute the proportions to three decimal places, DemN is also higher among OV languages
compared to VO languages in Oceania (by .441 to .434). However, this difference is too
small to base any conclusion on.
(25) Abui (Timor-Alor-Pantar: Indonesia; Kratochvíl 2007: 111, 114)

     a. do    sura
        this  book
        ‘this book (near me)’

     b. kaai  do
        dog   DEF
        ‘the dog (I just talked about)’


Significantly, Abui is an OV language, so the fact that Abui exhibits the oppo- 
site pattern from what we saw in Swahili again conforms to the crosslinguistic 
pattern described above. 

The situation in Ute is similar to that in Abui. Namely, Ute is OV and the word
'u functions as a demonstrative when it precedes the noun, as in (26a), but as a
marker of definiteness when it follows the noun, as in (26b).


(26) Ute (Uto-Aztecan: United States; Givón 2011: 50, 38)

     a. 'U        kava       sá-gha-ru-mu              qháru-kwa-puga.
        that.SBJ  horse.SBJ  white-have-NMLZ-ANIM.SBJ  run-go-REM
        ‘That white horse ran away.’

     b. ta'wa-chi     'u       sivaatu-chi    paqha-qa.
        man-ANIM.SBJ  DEF.SBJ  goat-ANIM.OBJ  kill-ANT
        ‘The man killed a goat.’


The situation in Loniu is somewhat different. In Loniu, the definiteness marker 
and demonstrative are similar in form, though not identical, with iy as the def- 
initeness marker and iyo as the demonstrative. The two in fact can co-occur as 
in (27), with the definiteness marker preceding the noun, and the demonstrative 
following the noun. 


(27) Loniu (Austronesian, Oceanic: Papua New Guinea; Hamel 1994: 100)
     iy   amat  iyo
     DEF  man   this
     ‘this man’

Again, since Loniu is VO, this order difference conforms to the crosslinguistic
pattern described above.
And we find similar phenomena in cases where the definiteness marker and
demonstrative are completely different in form but can co-occur, with one pre- 
ceding the noun and one following. In Kana, the definiteness marker precedes 
the noun while the demonstrative follows, as in (28). 


(28) Kana (Niger-Congo, Delta Cross: Nigeria; Ikoro 1996: 70)
     ló   bari  ämä
     DEF  fish  this
     ‘this fish’


Since Kana is VO, this conforms to the crosslinguistic pattern. Contrast this with 
the situation in Kwoma (Washkuk), which is OV, and in this case it is the demon- 
strative that precedes the noun and the definiteness marker that follows, as in 
(29). 


(29) Kwoma (Sepik: Papua New Guinea; Kooyers 1974: 49) 
kata ma rii 
that man DEF 


‘that man’ 


These differences between demonstratives and definiteness markers are a puz- 
zle if demonstratives are the primary grammaticalization source for definiteness 
markers. It should be emphasized, however, that although definiteness markers 
and demonstratives exhibit very different patterns in terms of how they corre- 
late with the order of verb and object, it is still the case that they correlate with 
each other, i.e. that the order of definiteness marker and noun and the order of 
demonstrative and noun correlate. This is shown in Tables 27 and 28, excluding 
languages where the definiteness marker is the same as the demonstrative. Ta- 
ble 27 shows that among DefN languages with definiteness markers that are dis- 
tinct from demonstratives, it is approximately twice as common for the demon- 
strative to precede the noun as well, by 41 genera to 20. 


Table 27: Order of noun and demonstrative in DefN languages 


Africa Euras Oceania N.Amer S.Amer TOTAL 


DefN & DemN 3 [7] [12] [11] [8] 41 
DefN & NDem [4] 3 7 3 3 20 
Conversely, Table 28 shows that among NDef languages with definiteness
markers that are distinct from demonstratives, it is much more common for the 
demonstrative to follow the noun as well, by 67 genera to 11. 


Table 28: Order of noun and demonstrative in NDef languages 


Africa Euras Oceania N.Amer S.Amer TOTAL 


NDef & DemN 4 3 2 1 1 11 
NDef & NDem [33] [6] [19] [8] 1 67 
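
The strength of the association between the two orders can be summarized from the totals of Tables 27 and 28. The sketch below is only one possible way of quantifying it; the share of genera in which both elements occur on the same side of the noun is a summary measure chosen for this illustration and is not a statistic used in the original discussion.

```python
# Totals from Table 27 (DefN languages) and Table 28 (NDef languages).
# The dictionary layout and the "same side" measure are illustrative assumptions.
def_n = {"DemN": 41, "NDem": 20}
n_def = {"DemN": 11, "NDem": 67}

# Share of genera in which the demonstrative is on the same side as the
# definiteness marker, computed separately for each definiteness order.
same_side_def_n = def_n["DemN"] / sum(def_n.values())
same_side_n_def = n_def["NDem"] / sum(n_def.values())

# Overall share of genera in which the two orders match.
matching = def_n["DemN"] + n_def["NDem"]
total = sum(def_n.values()) + sum(n_def.values())
overall_same_side = matching / total

print(f"DefN: demonstrative also precedes in {same_side_def_n:.2f} of genera")
print(f"NDef: demonstrative also follows in {same_side_n_def:.2f} of genera")
print(f"Orders match in {overall_same_side:.2f} of all genera")
```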


While grammaticalization probably plays some role in explaining this correla- 
tion, it seems likely that the clear semantic similarity between definiteness mark- 
ers and demonstratives plays a role as well. There is also a correlation between 
the order of definiteness marker and noun and the order of indefinite marker and 
noun, a correlation that is presumably due to semantic similarity or processing, 
not grammaticalization. 


5 Conclusion 


I have argued that there is evidence that any approach to explaining word order 
correlations that ignores the role of grammaticalization is inadequate. At the 
same time, I have argued that while grammaticalization plays a role in explaining 
some correlations, a pure grammaticalization approach fails as well. 

Although I have focused my discussion of SVO & GenN languages on those 
with both prepositions and postpositions, further research is needed on SVO & 
GenN languages with prepositions as the only or dominant type or with postpo- 
sitions as the only or dominant type. Grammaticalization theory would predict 
that SVO & GenN languages with prepositions will be ones where the primary 
source of adpositions is verbs, while SVO & GenN languages with postpositions 
will be ones where the primary source of adpositions is head nouns in genitive 
constructions. I suspect that this is true and if so, it would further bolster the 
argument that grammaticalization plays an important role in explaining correla- 
tions involving adpositions. One reason to suspect it is true is the geographical 
distribution of the two types of languages. My database includes 21 genera con- 
taining SVO & GenN languages with prepositions and 13 of these genera (almost 
two thirds of them) are in an area stretching from China and Southeast Asia 
through Austronesian. The fact that so many of the SVO & GenN languages are 
in this region is significant since my impression is that the grammaticalization
of adpositions from verbs is especially common in this region. Conversely, my 
database includes 19 genera containing SVO & GenN languages with postposi- 
tions and only two of these genera are in the region mentioned above stretching 
from China through Austronesian where SVO & GenN & Pr languages are com- 
mon. I suspect that this is because outside that region, it is more common for 
adpositions to grammaticalize from nouns. However, this is a matter for future 
research. 


Abbreviations 


The paper abides by the Leipzig Glossing Rules. Additional abbreviations include 
the following ones: 


ANIM animate EXPEC expectational 

ANT anterior NC noun class 

CONSTR construct state REM remote 
Chapter 5


Preposed adverbial clauses: Functional 
adaptation and diachronic inheritance 


Holger Diessel 


University of Jena 


In the historical literature it is commonly assumed that subordinate clauses are 
derived from paratactic sentences. However, while this assumption is not implau- 
sible for certain types of postposed adverbial clauses, there is no obvious connec- 
tion between preposed adverbial clauses and parataxis. This paper investigates the 
diachronic development of preposed adverbial clauses from a cross-linguistic perspective.
Drawing on data from a typological and diachronic database, it is shown
that preposed adverbial clauses evolve from various diachronic sources that are 
semantically and structurally similar to the target construction (e.g. adpositional 
phrases, pre- and postnominal relative clauses, juxtaposed sentences). Considering 
the factors behind these developments, the paper argues that while the occurrence 
of preposed adverbial clauses can be explained by general cognitive processes of 
language use, the internal structure of preposed adverbial clauses, notably the
position of the subordinator, is primarily determined by grammaticalization.


1 Introduction 


It is a standard assumption of historical linguistics that syntactic structures of- 
ten develop from structurally independent elements in discourse (Givón 1979). 
An oft-cited example is the diachronic development of subordinate clauses from 
paratactic sentences. As Lehmann (1988) and others have shown, there is a cline 
of clause linkage ranging from the combination of two structurally independent 
sentences in discourse to tightly organized bi-clausal structures in which one 
clause is syntactically dependent on the other one. Building on this observation, 
it is commonly assumed that subordinate clauses have evolved from independent
sentences or parataxis (e.g. Hopper & Traugott 2003: 176-184). However,
while this assumption appears to be plausible for many postposed subordinate
clauses, there is no obvious connection between parataxis and preposed subordi- 
nate clauses. 

Clause combining in discourse has a backwards orientation. Paratactic sen- 
tences are usually related to previous sentences, as evidenced by the occurrence 
of anaphoric pronouns and clause linkers that connect the current sentence to 
participants and propositions of the preceding sentence or discourse (1). 


(1) Johnᵢ was accepted to Harvard. Therefore, heᵢ moved to Boston.


Like independent sentences, complex sentences are processed with a backwards
orientation if the subordinate clause follows the main clause (e.g. Johnᵢ
moved to Boston, because heᵢ was accepted to Harvard). However, unlike
paratactic sentences, preposed subordinate clauses have an inherent forward
orientation
in that pronouns and clause linkers are related to elements of the upcoming main 
clause (2). 


(2) Because heᵢ was accepted to Harvard, Johnᵢ moved to Boston.


Considering the projective force of preposed subordinate clauses, it is unclear 
if and how these structures have evolved from clause combining strategies in dis- 
course. It is the purpose of this paper to investigate the diachronic developments 
of preposed subordinate clauses from a cross-linguistic perspective. Specifically, 
the paper is concerned with the development of preposed adverbial clauses. 

Following Cristofaro (2003), adverbial clauses are here defined as part of a 
biclausal construction consisting of a main clause and a subordinate clause in 
which the event designated by the subordinate clause specifies the circumstances 
under which the event of the main clause takes place. Several typological stud- 
ies have investigated the positional patterns of adverbial clauses (e.g. Greenberg 
1963; Diessel 2001; Schmidtke-Bode 2009; Diessel & Hetterle 2011; Hetterle 2015); 
but they are either based on small and biased samples or concentrate on par- 
ticular adverbial relations (e.g. purpose or cause). In the current study, we will 
be concerned with four general semantic types of adverbial clauses (i.e. adver- 
bial clauses of time, condition, cause and purpose) based on data from a genet- 
ically and geographically dispersed convenience sample of 100 languages. The 
languages come from 85 genera (which maximally include two languages) and 
six large geographical areas (i.e. Eurasia, Africa, South East Asia and Oceania, 
Australia and New Guinea, North America, South America) (cf. Dryer 1992). The 
bulk of the data were gathered from reference grammars and other published
sources, supplemented by information from native speakers and language
specialists.¹

The paper is divided into three parts. The first part describes the cross-linguis- 
tic distribution of preposed adverbial clauses in the 100 language sample; the 
second part provides an overview of the main diachronic paths to preposed ad- 
verbial clauses; and the third part considers the developments described in light 
of the debate about functional and diachronic explanations for language univer- 
sals that takes center stage in the present volume. 


2 Cross-linguistic patterns 


Let us begin with some general observations regarding the position of subor- 
dinate clauses. Subordinate clauses are dependent categories of an associated 
element. Three basic types of subordinate clauses are commonly distinguished: 
(i) complement clauses, which are dependent categories of a complement-taking 
verb or predicate, (ii) relative clauses, which are dependent categories of a noun 
or noun phrase, and (iii) adverbial clauses, which may be seen as dependent cat- 
egories of a main clause or main clause predicate. 

The position of all three types of subordinate clauses relative to the associ- 
ated element correlates with the position of other dependent categories relative 
to the so-called head, but the correlations are skewed in particular directions 
(Diessel 2001). As Greenberg (1963) already noted, the order of relative clause 
and noun correlates with that of object and verb, but there is a predominance of 
postnominal relative clauses. In VO languages, relative clauses are almost always 
postposed to the associated N(P), but in OV languages we find both prenominal 
and postnominal relatives (cf. Dryer 2005). 

The order of complement clause and verb is similar. As Schmidtke-Bode & 
Diessel (2017) have shown, although object complement clauses usually serve the 
same syntactic function as object NPs, they do not always occur in the same struc- 
tural position as nominal objects. In VO languages, complement clauses follow 
the verb with almost no exception, but in many OV languages, too, they are postposed
to the main verb, as for instance in Persian, Epena Pedee and Supyire. There is
thus a general tendency for both relative and complement clauses to follow the 
associated category, which may be due to the oft-noted trend for long and heavy 
constituents to follow short ones (cf. Behaghel 1932). 


¹ A list of languages included in the sample is given in the Appendix.

However, adverbial clauses are different. Although adverbial clauses are long
constituents, they often precede the main clause. As Diessel (2001) observed 
(based on data from a small and biased sample), in VO languages, adverbial clauses 
occur both before and after the associated main clause, but in some OV languages, 
there is a general tendency to prepose all adverbial clauses. This tendency is also 
evident in the current sample (cf. Table 1). 


Table 1: The order of adverbial clause (AC) and main clause (MC) and
the order of verb and object


         Languages in which    Languages in which    Languages in which    Total
         all types of ACs      ACs are commonly      all types of ACs
         (usually) precede     pre- and postposed    (usually) follow
         the MC                                      the MC

VO        –                    40                     –                      40
VO/OV     –                     8                     –                       8
OV       31                    21                     –                      52
Total    31                    69                     –                     100


As can be seen, most of the languages of the current sample make common 
use of both pre- and postposed adverbial clauses, but in more than half of all OV 
languages, adverbial clauses are usually preposed to the main clause. In Japanese, 
for instance, there is a very strong tendency to prepose adverbial clauses (though 
in spoken Japanese, adverbial clauses sometimes follow the main clause as af- 
terthoughts; cf. Ford & Mori 1994). 
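
The claim that more than half of all OV languages prepose their adverbial clauses can be checked directly against the counts in Table 1. The following is only a minimal sketch: the dictionary and function names are illustrative and not part of the study, and cells without a value in the table are entered as 0.

```python
# Languages per basic word order, following the counts in Table 1.
# Tuple order: (ACs usually precede the MC,
#               ACs commonly pre- and postposed,
#               ACs usually follow the MC)
# Cells shown without a value in the table are entered as 0.
TABLE_1 = {
    "VO": (0, 40, 0),
    "VO/OV": (0, 8, 0),
    "OV": (31, 21, 0),
}


def preposing_share(word_order):
    """Percentage of languages of a word-order type that prepose all ACs."""
    preposed, mixed, postposed = TABLE_1[word_order]
    return round(100 * preposed / (preposed + mixed + postposed), 1)


for word_order in TABLE_1:
    print(f"{word_order:<6} {preposing_share(word_order):5.1f}% rigidly preposing")
```

On these counts, 31 of the 52 OV languages (roughly 60%) fall into the rigidly preposing group, whereas none of the VO or VO/OV languages do.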

Generalizing across the data in Table 1, we may say that while the order of 
adverbial clause and main clause correlates with that of object and verb, the oc- 
currence of preposed adverbial clauses is cross-linguistically predominant. How- 
ever, on closer inspection we find that the predominance of preposed adverbial 
clauses is mainly due to certain semantic types of adverbial clauses that precede 
the main clause in both VO and OV languages. Consider the data in Table 2, 
which show that the positional patterns of adverbial clauses correlate with their 
meaning. 

Table 2: The meaning and position of adverbial clause constructions in
a sample of 100 languages


             Preposed        Pre- and postposed    Postposed       Total

Condition     94 [91.3%]       9  [8.7%]             0    [0%]      103
Time         119 [59.8%]      68 [34.2%]            12  [6.0%]      199
Cause         40 [35.4%]      24 [21.2%]            49 [43.4%]      113
Purpose       33 [28.7%]      19 [16.5%]            63 [54.8%]      115

Total        286             120                   124              530


Note that the frequencies in Table 2 are based on constructions rather than on
languages. Since some languages have multiple adverbial clause constructions of
the same semantic type, Table 2 includes a larger number of constructions than
languages. Note also that this table concerns both adverbial clauses that are tied
to a specific position by linguistic convention and adverbial clauses that are
statistically biased to precede or follow the main clause. In the latter case, some of
the data in Table 2 are based on frequency counts from linguistic corpora, but
more often these data are based on field workers’ judgements regarding the
position of adverbial clauses. While expert judgements are less reliable than corpus
counts, they provide a reasonable estimate as to how main and adverbial clauses
are arranged in a particular language.²
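
The bracketed percentages in Table 2 are row-wise shares: each count is divided by the total number of constructions of that semantic type. As a minimal sketch (the names below are illustrative and not part of the study), the shares can be recomputed from the counts, with each row entered so that its three cells sum to the row total given in the table:

```python
# Counts of adverbial clause constructions per semantic type, in the order
# (preposed, pre- and postposed, postposed). Each row is entered so that it
# sums to the row total stated in Table 2. Names are illustrative only.
TABLE_2 = {
    "Condition": (94, 9, 0),
    "Time": (119, 68, 12),
    "Cause": (40, 24, 49),
    "Purpose": (33, 19, 63),
}


def row_shares(counts):
    """Return each cell as a percentage of its row total, rounded to 0.1."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


for meaning, counts in TABLE_2.items():
    pre, mixed, post = row_shares(counts)
    print(f"{meaning:<10} preposed {pre:5.1f}%  both {mixed:5.1f}%  "
          f"postposed {post:5.1f}%  (n={sum(counts)})")
```

Read this way, the shares of preposed constructions fall steadily from condition through time and cause to purpose, which is the semantic gradient the discussion of Table 2 builds on.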

As can be seen, conditional clauses typically precede the main clause (cf. Green- 
berg 1963: Universal 14), though in many languages, conditional clauses can also 
be postposed to the main clause. Like conditional clauses, temporal clauses tend 
to precede the main clause, but temporal clauses follow the main clause more 
often than conditionals. The position of temporal clauses varies with the nature 
of the temporal link they encode. For instance, temporal clauses denoting a prior 
event, i.e. an event that precedes the one in the main clause, are more often pre- 
posed than temporal clauses denoting a posterior event. In English, for example, 
after- and since-clauses denote a prior event and precede the main clause more 
often than adverbial clauses denoting a posterior event such as before- and un- 
til-clauses (cf. Diessel 2008). The same tendency has been observed in several 
other languages of the current sample (e.g. in German, Supyire, Abun,
Nkore-Kiga, Noon, and Taba).

Moreover, and this is particularly striking, there is a general tendency to pre- 
pose adverbial clauses that correspond to English when-clauses. Like after and 
since, when can denote a prior event, but it can also indicate a link between events 
that occur simultaneously (Diessel 2008). However, regardless of the temporal
relationship that is expressed by a when-clause, there is a tendency for temporal
when-clauses to precede the main clause. In fact, in a substantial number of
languages in the current sample, when-clauses are generally preposed to the main
clause (i.e. Abun, Supyire, Yagua, Trumai, Motuna).


² Psycholinguistic evidence suggests that while speakers have difficulty estimating
the absolute frequencies of linguistic elements, their judgements of relative
linguistic frequencies are quite reliable (Hasher & Zacks 1984).

Finally, cause and purpose clauses tend to follow the main clause. Table 2 
shows that there are 40 adverbial clause constructions of cause and 33 adverbial 
clause constructions of purpose that precede the main clause, but most of these 
constructions occur in languages like Japanese, in which all adverbial clauses 
are preposed to the main clause regardless of their meaning. Generalizing across 
these findings we may conclude that the cross-linguistic tendency to prepose ad- 
verbial clauses is mainly due to the fact that conditional clauses and certain types 
of temporal clauses, notably when-clauses, precede the main clause regardless of 
the order of other syntactic constituents. 

Interestingly, a number of studies suggest that the position of adverbial clauses 
does not only correlate with the semantic link between main clauses and adver- 
bial clauses, but also with aspects of their internal structure. Of particular impor- 
tance here is the position of the subordinator (cf. Diessel 2001; Schmidtke-Bode 
2009; Hetterle 2015). Across languages, adverbial clauses are often marked by 
subordinating conjunctions that typically appear at the beginning or end of the
subordinate clause. Dryer (1992) showed that the position of the subordinator 
correlates with the order of verb and object: In VO languages, adverbial clauses 
usually occur with initial subordinators, but in OV languages they often include 
a final marker. However, the position of the subordinator does not only correlate 
with the order of verb and object, it also correlates with the position of the adver- 
bial clause. Consider the data in Table 3, which is restricted to adverbial clauses 
with free subordinating morphemes.³

As can be seen, adverbial clauses that follow the main clause or that are flexible
with regard to their position typically occur with an initial marker. There
are languages in which postposed and flexible adverbial clauses include a final 
marker, but this is relatively rare (and mainly found in certain areas, e.g. South 
America). By contrast, preposed adverbial clauses are frequently marked by a fi- 
nal subordinator, especially in languages in which all adverbial clauses precede 
the main clause, as for instance in Amele, Burmese, Japanese, Korafe, Korean, 
Santali, Slave, Turkish, Wappo, Warao, and Menya. Only conditional clauses and 
temporal when-clauses are commonly preposed and often marked by an initial 
subordinator (in languages in which other semantic types of adverbial clauses 
are flexible or postposed to the main clause). 
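
The association between clause position and subordinator position can be made explicit by collapsing Table 3 over the four semantic types. The sketch below is illustrative only (the names are assumptions, and empty cells of the table are entered as 0); it computes, for each clause position, the share of constructions with a clause-final subordinator.

```python
# Free subordinators by clause position, as (initial, final) counts per
# semantic type, following Table 3. Empty cells are entered as 0.
# All names are illustrative only.
TABLE_3 = {
    "preposed": {"Condition": (34, 22), "Time": (20, 47),
                 "Cause": (2, 26), "Purpose": (0, 20)},
    "flexible": {"Condition": (5, 0), "Time": (43, 5),
                 "Cause": (11, 6), "Purpose": (2, 4)},
    "postposed": {"Condition": (0, 0), "Time": (7, 3),
                  "Cause": (37, 4), "Purpose": (38, 4)},
}


def final_share(rows):
    """Percentage of constructions with a clause-final subordinator."""
    initial = sum(i for i, _ in rows.values())
    final = sum(f for _, f in rows.values())
    return round(100 * final / (initial + final), 1)


for position, rows in TABLE_3.items():
    print(f"{position:<10} {final_share(rows):5.1f}% final subordinators")
```

On these counts, about two thirds of the preposed constructions take a final subordinator, against roughly one in five flexible and one in eight postposed constructions.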


³ Since adverbial clause constructions that do not include a free subordinating morpheme are
disregarded, Table 3 includes only a subset of the adverbial clause constructions in Table 2.


Table 3: The position of free subordinators in pre- and postposed
adverbial clauses


                                   Flexible
             Preposed              (no preference)       Postposed

             Initial    Final      Initial    Final      Initial    Final      Total

Condition      34         22          5         –           –         –          61
Time           20         47         43         5           7         3         125
Cause           2         26         11         6          37         4          86
Purpose         –         20          2         4          38         4          68

Total          56        115         61        15          82        11         340


3 Diachronic sources 


Having described the positional patterns of adverbial clauses (and adverbial sub- 
ordinators), let us now consider their diachronic evolution. Where do preposed 
adverbial clauses come from? In the historical literature, syntactic development 
is commonly described as a process that leads from a source construction A to 
a target construction B, but this scenario is not always appropriate to character- 
ize syntactic change (cf. Givón 1991; Van de Velde et al. 2013). Since subordinate 
clauses are complex grammatical units, they are usually related to several other 
constructions, e.g. other types of subordinate clauses, certain types of phrasal 
constituents and independent sentences. Since all of these constructions can in- 
fluence the development of a particular adverbial clause, it is not always possible 
to trace adverbial clauses to one specific source. However, while the diachronic 
developments of adverbial clauses are (usually) influenced by several construc- 
tions, in many cases there is one construction that is so closely related to a cer- 
tain type of adverbial clause that it can be seen as the primary determinant, or 
source, of that clause. For instance, many postposed adverbial clauses are so simi- 
lar to paratactic sentences that it seems reasonable to assume that parataxis has a 
significant impact on the development of (many) postposed subordinate clauses. 
However, while the development from parataxis provides a plausible scenario for 
the rise of (many) postposed adverbial clauses, it does not explain where preposed
adverbial clauses come from. 

Since preposed adverbial clauses are thematically related to the ensuing
discourse, there is no obvious connection to parataxis unless we assume that
preposed adverbial clauses are based on postposed subordinate clauses that were
fronted after they developed from paratactic sentences. However, there is no ev- 
idence for this scenario. The diachronic developments of adverbial clauses have 
been examined in a large number of studies (e.g. Haiman 1985; Haspelmath 1989; 
Givón 1991; Genetti 1991; Harris & Campbell 1995; Frajzyngier 1996; Disterheft 
& Viti 2010), but although fronting appears to provide a plausible scenario for 
the development of preposed adverbial clauses, there is almost no evidence for 
this scenario in the historical literature. On the contrary, what previous stud- 
ies suggest is that adverbial clauses usually occur in the same position as their 
diachronic sources. In what follows, we consider four common source construc- 
tions for preposed adverbial clauses. 

First, while preposed adverbial clauses are unlikely to have evolved from parat- 
actic sentences through fronting, there is one common diachronic path that leads 
from independent sentences in discourse to complex sentences with preposed 
adverbial clauses. As Haiman (1985: 39-70) observed, in many languages condi- 
tional relations are expressed by juxtaposed clauses that have the same structure 
as two simple sentences, as in the following examples from Vietnamese (3), Ma- 
pudungun (4) and Wambaya (5). 


(3) Vietnamese (Austro-Asiatic, Viet-Muong; Haiman 1985: 45) 
[Không có  màn], không chịu nổi.
 not   be  net   not   bear can


‘If there's no net, you can't stand it.’


(4) Mapudungun (Araucanian; Smeets 2008: 184)
[Aku-wye-fu-l-m-i], pe-pa-ya-fwi-y-m-i.
arrive-PLPF-IPD-COND-2-SG see-hither-IRR-OBJ-IND-2-SG


‘If you had arrived (by then), you would have seen him.’


(5) Wambaya (West Barkly; Nordlinger 1998: 219)
[Yabu ng-uda          gijilulu]      jiyawu ng-uda.
 have 1SG.A-NACT.PST  money.IV(ACC)  give   1SG.A-NACT.PST


‘If I'd had the money I would have given (it to her).’


While some of these languages have conditional markers (e.g. Vietnamese nếu
‘if’), conditional relations are commonly expressed by unmarked sentences that 
have the same structure as main clauses: they include finite verb forms, occur 
with the same arguments and adjuncts as independent sentences, and do not 
include an (obligatory) subordinate marker. Note, however, that while these
constructions look like independent sentences, they are intonationally bound to the
ensuing clause and sometimes constrained with regard to verb inflection. The
conditional clause in Mapudungun, for instance, takes a mood suffix that is op- 
tional in main clauses but obligatory in conditionals. Moreover, in some lan- 
guages these constructions occur with a topic or focus marker that one might 
analyze as a subordinator, such as the focus clitic at the end of the protasis in 
example (6) from Mangarayi. 


(6) Mangarayi (Isolate; Merlan 1982: 22) 
[Aa-yay-gu=bayi] wawg wa-fian-mi biwin-gana. 
2SG-go-DI-FOC follow IRR-1SG>2SG-AUX behind-ABL


‘If you go, I will follow (after) you’ 


In addition to conditional clauses, preposed temporal clauses are sometimes 
based on juxtaposed sentences (e.g. in Lao, Vietnamese, Taba, Tetun, Gooniyandi); 
but preposed cause and purpose clauses are usually based on other types of con- 
structions. Adpositional phrases, for instance, are often closely related to (pre- 
posed) cause and purpose clauses. Consider, for instance, the following examples 
from Turkish (7) and Amele (8), in which cause and purpose clauses are marked 
by benefactive postpositions. 


(7) Turkish (Turkic; Kornfilt 1997: 74) 
Hasan [kitab-ı  san-a   ver-diğ-im       için] çok  kız-dı.
Hasan  book-ACC you-DAT give-F.NMLZ-1SG  for   very angry-PST


‘Hasan got very angry because I gave the book to you.’


(8) Amele (Nuclear Trans New Guinea, Madang; Roberts 1987: 58) 
[Ija sab  faj-ec       nu]  h-ug-a.
 1SG food buy-INF/NMLZ for  come-1SG-PST


‘I came to buy food.’


Note that the adverbial clauses in both examples are expressed by nominaliza- 
tions. While adpositions and case affixes are also found with finite clauses, they 
are especially frequent with nominalized clauses, suggesting that nominalization 
provides a link between adpositional phrases and fully developed (subordinate) 
clauses (cf. Deutscher 2009; Heine 2009). 

Adverbial clauses that are morphologically related to adpositional phrases are 
widely used to express semantic relations of cause and purpose. In addition,
certain types of temporal clauses denoting a prior or posterior event are often
strikingly similar to (temporal) adpositional phrases (e.g. English after-, since- and
before-clauses) (Blake 1999; Hetterle 2015); but conditional clauses and temporal
when-clauses are only rarely marked by adpositions. 

Apart from juxtaposed sentences and adpositional phrases, relative clauses 
provide a very frequent source for (preposed) adverbial clauses. The develop- 
ment is well-known from English. As Hopper & Traugott (2003) have shown, 
temporal while-clauses have evolved from a relative or appositive construction 
that modified a generic head noun meaning ‘time’ (9). 


(9) Old English (Indo-European, Germanic; Hopper & Traugott 2003: 90) 
& wicode þær þa hwile [þe man þa burg worhte
and lived there that.DAT time.DAT that one that fortress worked.on
& getimbrode].
and built
‘... and camped there at the time that/while the fortress was worked on
and built.’


Similar types of adverbial clauses occur in many other languages of the current 
sample (e.g. in Mayogo (10) and Toqabaqita (11)). Sometimes the subordinator is
based on a generic noun, and sometimes it is based on a relative marker (as for 
instance many of the adverbial subordinators in Tamasheq; cf. Heath 2005: 660). 


(10) Mayogo (Niger-Congo, Ubangi; Sawka 2001: 153) 
[Nedhinga u a-zu he], ndili-e a-si kuto. 
while/time 3PL PST-eat thing child-REF PST-sleep down


‘While they ate something, this child slept on (the) floor.’


(11) Toqabaqita (Austronesian, Oceanic; Lichtenberk 2008: 1173) 
[Si manga na kero fula mai], keko gono qa-daroqa 
PRTT time REL 3DU.NON.FUT arrive VENT 3DU.SEQ sit SBEN-3DU.PERS


‘When (lit. ‘the time that’) they arrived, they sat (down) ...’


The development is especially frequent with temporal when- and while-clauses, 
but there are also other semantic types of adverbial clauses that are based on rel- 
ative clauses in my data. In German, for instance, cause and condition clauses are 
marked by adverbial subordinators (i.e. weil and falls) that are based on nominal 
heads of relative or appositive clauses meaning ‘time (span)’ and ‘case’. Moreover,
at least 25 languages of the current sample have conditional clauses based on tem- 
poral when/while-clauses (which at least in some cases are ultimately based on 
relative clauses). Note that this development does not only involve postnominal
relatives but also prenominal and internally headed relative clauses, as illustrated
by the following examples from Amele (12), Korean (13) and Jamsay (14).⁴


(12) Amele (Nuclear Trans New Guinea, Madang; Roberts 1987: 57) 
[Ija cabi   meul ceh-ig-en     sain eu   na] ma   ca  ceta ca  mun
 1SG garden new  plant-1SG-FUT time that at  taro add yam  add banana
ca  manin ca  ceh-ig-en.
add bean  add plant-1SG-FUT


‘When I plant my new garden, I will plant taro, yam, banana and beans.’


(13) Korean (Isolate; Sohn 1994: 70) 


Na-nun [pi-ka    w-ass-ul       ttay-(ey)] ttena-ss-ta.
I-TC    rain-NOM come-PST-PROSP time-at    leave-PST-DECL
‘I left when it had rained.’


(14) Jamsay (Dogon; Heath 2008: 559) 
[Wárü dögürü ü gô:-Ø] 
farming time 2SG.SBJ go:out.PFV-PTCP.NON.HUMAN


‘At the time when you (first) went out to do the farming, ...’


Finally, preposed adverbial clauses are also often influenced by complement 
clauses. In Middle English, for instance, adverbial subordinators were frequently 
accompanied by the complementizer that (e.g. after that, since that, gif that), 
which is still commonly used in result clauses (cf. so that). Likewise, in Chal- 
catongo Mixtec, most adverbial clauses are marked by the complementizer xa-, 
which also appears in complement and relative clauses (Macaulay 1996: 156-168). 
Moreover, there is a well-known path that leads from quotative constructions, 
which in many languages are similar to complement clauses, to adverbial clauses. 
In particular, purpose and cause clauses are sometimes derived from quotatives 
(cf. Güldemann 2008). 

Quotative constructions consist of a “quote index”, including a “quotative
marker”, and a “quote clause” of direct speech that often shows little evidence for
embedding (cf. Güldemann 2008). In many cases, the quotative marker is a general
verb of saying (e.g. ‘say’, ‘speak’), but it can also be a marker of similarity (e.g.
‘like’) or a manner deictic (e.g. ‘so’). Although quote clauses are often not
embedded in the associated clause, the quotative verb takes the quote clause as some
kind of semantic argument, which typically occurs in the same position as a
direct object.⁵ When this happens in OV languages, the consequence is that quote
clauses precede the quotative verb. If these constructions are extended into the
domain of adverbial subordination, the adverbial clause is preposed to the main
clause (or main verb) and marked by a clause-final subordinator that is ultimately
based on the quotative verb, as in the following examples from Aguaruna (15) and
Lezgian (16).


⁴ According to Epps (2009), Hup has adverbial clauses that are based on headless relative clauses.


(15) Aguaruna (Jivaroan; Overall 2009: 175) 
Nuwa-na [yumi fikika-ta tu-sá] awima-wa. 
woman-ACC water draw.ASP-IMP say-SUB.3.SS send.ASP-NON.A/S>A/S
‘When (they) sent a woman to draw water, ...’ (lit. ‘saying “draw some
water, ...”’)


(16) Lezgian (Nakh-Daghestanian; Haspelmath 1993: 390) 
Bazar.di-n jug ada-z [tars-ar awa-č luhuz] tak’an
Sunday-GEN day he-DAT lesson-PL be.in-NEG saying hateful
xa-nwa-j.
become-PRF-PST


‘He hated Sunday because there were no lessons.’


⁵ Munro (1982) and Güldemann (2008) point out that quote clauses do not generally occur in
the same position as direct objects, which is one reason why these researchers argue that
quote clauses are not (always) complements. However, while quote clauses are often less tightly
integrated into a clause (or VP) than direct objects, they are related to object complement
clauses by family resemblance and since object complement clauses pattern with object NPs,
there is also a tendency for quote clauses to occur in the same position as direct objects (see
Schmidtke-Bode & Diessel 2017 for some discussion of the relationship between quote clauses,
object complement clauses and nominal objects from a cross-linguistic perspective).


Table 4: Frequent source constructions of preposed adverbial clauses


Condition   juxtaposed sentences (Haiman 1985)
            temporal when/while-clauses (Traugott 1985)

Time        adpositional phrases / nominalizations (Genetti 1991)
            pre- and postnominal relative clauses (Givón 1991)

Cause       adpositional phrases / nominalizations (Genetti 1991)
            quotative / complement constructions (Ebert 1991)

Purpose     adpositional phrases / nominalizations (Schmidtke-Bode 2009)
            quotative / complement constructions (Güldemann 2008)


Table 4 provides an overview of the various sources for preposed adverbial
clauses considered in this section. Let me emphasize that this table simplifies in
several ways. First, as pointed out above, the development of adverbial clauses
is usually influenced by multiple constructions so that there are often several
source constructions (though one of them is often dominant). Second, there are
frequent diachronic connections between the various semantic types of adverbial
clauses that are not indicated in Table 4 except for the development of temporal
when/while-clauses into conditional clauses, which is particularly frequent. Third,
there is reason to assume that postposed adverbial clauses can influence the
structure of preposed adverbial clauses through analogical extension (cf.
Traugott 1985). Fourth, in addition to the eight source constructions shown in Table 4,
there are other (less frequent) source constructions of preposed adverbial clauses
that have been disregarded. And finally, there is evidence that constructional 
change typically proceeds in a local fashion that is driven by language users' ex- 
perience with particular lexical expressions (e.g. Givón 1991), but this has been 
ignored in the above discussion. In order to account for all of these factors, one 
would need a different theoretical approach — perhaps some kind of network 
model, in which adverbial clauses are linked to various other types of construc- 
tions that simultaneously affect their use and their development (see Diessel 2015 
for some discussion of such a model). However, in what follows we concentrate 
on the idealized developments that are summarized in Table 4. 


4 Discussion: Functional adaptation and/or persistence 


To recapitulate, we have seen that the occurrence of preposed adverbial clauses 
correlates with the position of other grammatical categories and the semantic 
relationship between main and adverbial clause (§2), and we have seen that
condition, time, cause and purpose clauses develop from, or under the influence of, a
wide range of constructions (§3). Concluding the paper, let us ask what leads to
the development and cross-linguistic distribution of preposed adverbial clauses. 

Many linguistic typologists assume that language universals are motivated by 
semantic and pragmatic factors that influence the diachronic developments of lin- 
guistic structure. On this view, cross-linguistic regularities are functional adap- 
tations to communication and processing (e.g. Foley & Van Valin 1984; Dik 1989; 
Hawkins 2004). However, as particularly Cristofaro (2019 [this volume]) and
Collins (2019 [this volume]) argue, there is an alternative approach that stresses
the importance of diachronic inheritance, or persistence, for the rise of language 
universals. In this approach, cross-linguistic tendencies, or statistical universals, 
are the by-product of diachronic processes that are not immediately motivated
by functional-adaptive factors. In the remainder of this paper, I argue that the 
cross-linguistic tendencies in the linear organization of adverbial clauses are the 
result of both functional aspects of language use and persistence effects of gram- 
maticalization. 

PREPOSED ADVERBIAL CLAUSES IN HEAD-FINAL LANGUAGES. Given that clause 
combining in discourse has a strong backwards orientation, one might wonder 
why adverbial clauses are not generally postposed. However, there are several 
reasons why languages prepose adverbial clauses. To begin with, above we have 
seen that the positional patterns of adverbial clauses vary with their meaning, but 
in some rigid OV languages, they are consistently preposed to the main clause, 
suggesting that the order of main and adverbial clauses is part of the traditional 
VO/OV typology (cf. Diessel 2001). 

There are numerous proposals in the literature to explain correlations between 
the order of verb and object and that of other grammatical categories. Especially 
prominent is Hawkins' processing approach, in which word order correlations 
are explained by general principles of syntactic processing that are assumed to in- 
fluence both language use and language change (cf. Hawkins 1994; 2004). Specifi- 
cally, Hawkins proposed that head-final or OV languages tend to prepose depen- 
dent categories, including subordinate clauses, because syntactic structures with 
consistent dependent-head orders are easier to process, and thus more strongly 
preferred, than structures with mixed or inconsistent dependent-head orders (see 
Dryer 1992 for a similar explanation). 

The processing approach provides a straightforward explanation for the dom- 
inant use of preposed adverbial clauses in OV languages, but as Krifka (1985) and 
others have noted, word order correlations can also be explained by analogy or 
similarity. There is abundant evidence from psycholinguistic research that speak- 
ers tend to arrange semantically or formally similar expressions in parallel po- 
sitions (see Pickering & Ferreira 2008 for a review of psycholinguistic research 
on the influence of structural priming on linear order). Like objects, adverbial 
clauses are dependent categories, but other than that, adverbial clauses do not 
seem to have much in common with object NPs, making it rather unlikely that 
analogy and similarity account for this correlation. However, if we broaden the 
perspective and include other types of constructions into the analysis, there is 
reason to assume that the correlation between adverbial clauses and object NPs 
is due to analogical pressure that affects a whole network of constructions. 

To begin with, adverbial clauses are often similar to adpositional phrases func- 
tioning as adjuncts, and since the latter are usually similar to object NPs, ad- 
verbial clauses are also related to direct objects (via adjuncts). As Dryer (1992) 
showed, there is a very strong tendency to place nominal objects and (certain se- 
mantic types of) adjuncts on the same side of the verb, and since adverbial clauses 
pattern like adjunct phrases, they also pattern with object NPs. 

Moreover, in many languages, adverbial clauses are expressed by the same or 
very similar types of constructions as complement clauses. Since complement 
clauses are also related to object NPs, we may hypothesize that the ordering cor- 
relation between complex sentences including adverbial clauses and verb phrases 
including nominal objects is (also) mediated by constructions including comple- 
ment clauses, as the latter share properties with both of them. 

Thus, while adverbial clauses do not have much in common with object NPs, 
they are similar to adpositional phrases and complement clauses, which in turn 
are similar to nominal objects, suggesting that OV (or head-final) languages pre- 
pose adverbial clauses in analogy to (preposed) adpositional phrases and comple- 
ment clauses (17). 


(17)            CC-V 
        NP-V            AC-V 
                PP-V 


PREPOSED ADVERBIAL CLAUSES IN HEAD-INITIAL LANGUAGES. Analogy is one fac- 
tor that can motivate the occurrence of preposed adverbial clauses in head-final 
languages, but since the occurrence of preposed adverbial clauses is not restricted 
to head-final languages, analogy alone is not sufficient to explain why adverbial 
clauses are commonly preposed. As we have seen, certain semantic types of ad- 
verbial clauses, notably conditional clauses and temporal when-clauses, precede 
the main clause in both head-initial and head-final languages. In order to explain 
these patterns, we have to consider the semantic and discourse-pragmatic prop- 
erties of adverbial clauses. 

As Chafe (1984), Givón (1984) and many others have pointed out, preposed ad- 
verbial clauses serve particular discourse-organizing functions. They provide a 
thematic ground or orientation for subsequent information, as evidenced by the 
fact that preposed adverbial clauses are often marked as topics (Haiman 1978). In 
addition, there are particular conceptual motivations to prepose certain seman- 
tic types of adverbial clauses. Conditional clauses, for instance, exhibit a strong 
tendency to precede the main clause, as conditionals are used to create a partic- 
ular conceptual framework for the semantic interpretation of associated clauses 
(Diessel 2005), and some temporal clauses precede the main clause for reasons 
of iconicity (Diessel 2008). 

Considering the semantic and discourse-pragmatic functions of preposed ad- 
verbial clauses, we may hypothesize that these functions do not only influence 
speakers' use of a particular clause order (where there is synchronic choice) 
but also the development of preposed adverbial clauses in language change or 
language evolution. In particular, the initial stages of the development seem to 
be motivated by semantic and discourse-pragmatic factors. For instance, as we 
have seen, adverbial clauses are often based on relative clauses and adpositional 
phrases, which in VO languages usually follow the main verb (if we disregard 
center-embedded RCs), but may be fronted in order to provide an orientation, or 
topic, for the unfolding sentence. When the fronted constructions are routinely 
used for discourse-organizing functions, they may develop into preposed adver- 
bial clauses with the same or similar functions. 

Assuming that preposed adverbial clauses inherit their discourse functions 
from fronted relative clauses, adpositional phrases and similar constructions, one 
might argue that while discourse considerations motivate the use of the various 
source constructions, they do not immediately motivate the extension of these 
constructions to adverbial clauses, which seems to be a consequence of autom- 
atization, semantic bleaching and formal reduction rather than of discourse pro- 
cessing. However, since grammaticalization is a gradual process with no sharp 
division between source and target, I would contend that the influence of dis- 
course is not restricted to the initial uses of the source constructions but affects 
the entire course of the development. After all, automatization, semantic bleach- 
ing and formal reduction are driven by frequency of language use, which in turn 
is driven by the need to use fronted relative clauses, adpositional phrases or (in- 
cipient) adverbial clauses for particular discourse purposes. 

Thus, while one cannot say that preposed adverbial clauses have evolved to 
fill a functional gap within the linguistic system, it is still reasonable to con- 
ceive of them as functional adaptations to particular discourse environments, as 
preposed adverbial clauses develop under the continuing influence of discourse 
considerations. 

INITIAL AND FINAL SUBORDINATORS. Let us finally turn to the correlation be- 
tween the position of adverbial clauses and that of the subordinator. Recall that 
while postposed adverbial clauses are commonly introduced by a clause-initial 
conjunction, preposed adverbial clauses often occur with a final marker. In par- 
ticular, in languages in which adverbial clauses are generally preposed to the 
main clause, the subordinator typically occurs at the end of the adverbial clause. 
There are two general explanations for the position of adverbial subordinators 
in pre- and postposed subordinate clauses: one refers to processing, the other to 
grammaticalization. 

In Hawkins' (1994; 2004) processing approach, the positional patterns of ad- 
verbial subordinators are explained by two general principles. To simplify, one 
principle predicts that the subordinator occurs at the boundary to the main clause 
because linear structures of this type have a short "recognition domain" that is 
easy to process and thus more highly preferred than structures with a long recog- 
nition domain. And the second principle predicts that there is a general tendency 
to place the subordinator at the beginning of the subordinate clause (regardless 
of clause order), because initial subordinators prevent the parser from misinter- 
preting subordinate clauses as main clauses (see also Diessel 2005: 455-459). 

Hawkins' theory provides a good fit to the data, but lacks a diachronic di- 
mension. As it stands, it is completely unclear how the word orders that are 
explained by syntactic processing in this approach have evolved in language his- 
tory. Haspelmath (2019 [this volume]) argues that functional explanations do not 
need diachronic evidence if they correctly predict the typological data; but I dis- 
agree with this view because functional explanations can turn out to be spurious 
when we consider how particular phenomena have evolved. 

In fact, there is evidence that the above described correlation between the po- 
sition of the subordinator and the position of the subordinate clause is just a by- 
product of grammaticalization processes that are not immediately influenced by 
syntactic processing. That grammaticalization can have an impact on the linear 
organization of syntactic constituents has been observed in previous research (Li 
& Thompson 1974). Indeed, a number of studies have argued that (some) word or- 
der correlations are due to persistence effects in grammaticalization (e.g. Givón 
1975; Aristar 1991; Bybee 2010; Collins 2012; see also Collins 2019 [this volume] 
and Dryer 2019 [this volume]). 

For instance, according to Bybee (2010: 111), the correlation between the or- 
der of verb and object and that of verb and auxiliary does not need a particular 
functional explanation, as auxiliaries are usually derived from the main verb of a 
complement construction that includes an infinitive, or some other type of verb, 
as verbal complement (e.g. “want” INFINITIVE). If the verb precedes the verbal 
complement of a complex VP in the diachronic source, the auxiliary precedes 
the main verb in the target construction; but if the verb follows the verbal com- 
plement in the diachronic source, the auxiliary is postposed to the main verb in 
the target construction. As a consequence of these developments, the order of 
auxiliary and verb correlates with that of verb and object (18). 

(18)  [VERB [VERB]OBJ]VP        [[VERB]OBJ VERB]VP 
               ↓                         ↓ 
      [AUX VERB]VP              [VERB AUX]VP 


It is conceivable that the correlation between the position of adverbial subordi- 
nators and that of adverbial clauses is also due to persistence effects of grammat- 
icalization. For instance, above we have seen that purpose clauses in Amele and 
cause/purpose clauses in Turkish are marked by a clause-final subordinator that 
also serves as a benefactive adposition in postpositional phrases. Since postposi- 
tional phrases usually precede all other constituents in Amele and Turkish (and 
most other head-final languages), it is a plausible hypothesis that the occurrence 
of final subordinators in these constructions is related to the fact that they are 
based on postpositions (of preposed adpositional constructions). 


(19)  a.  [[NP] P]PP [...] 
                ↓ 
      b.  [[...S... SUB]AC [...]MC] 


In other cases, final subordinators are based on quotative verbs, as for instance, 
in some temporal and causal clauses of Aguaruna and Lezgian ((15)-(16)). Here 
again, the final position of the subordinator is likely to be a consequence of gram- 
maticalization. Since quotative clauses precede the quote verb in Aguaruna and 
Lezgian (and many other head-final languages), the final position of the subordi- 
nator is readily explained by the fact that it evolved from a quotative verb that 
followed the quote clause in the source construction. 


(20)  a.  [[QUOTE] V]                     SIMPLE S 
                ↓ 
      b.  [[...S... SUB]AC [...]MC]       COMPLEX S 


Crucially, while Hawkins' processing approach can also account for the main 
trends in the data, it cannot explain the exceptional cases. For instance, while 
postposed and flexible adverbial clauses are usually marked by initial subordina- 
tors (as predicted by Hawkins), there are 26 postposed (and flexible) adverbial 
clause constructions in the data in which the subordinator comes at the end of 
the adverbial clause, as in example (21) from Yagua. 


(21) Yagua (Peba-Yaguan; Payne & Payne 1990: 340) 
Deerá-miy sąąniy-yąą [sa-tjjysja túunu]. 
child-COLL shout-DISTRIB 3SG-play while 
‘The children are shouting while they play.’ 

While the existence of these structures flies in the face of Hawkins' processing 
account, it has a straightforward diachronic explanation. As Payne & Payne (1990: 
340) point out, the subordinate conjunction comes at the end of the adverbial 
clause in (21) because túunu ‘while’ has evolved from a postposition meaning 
‘side’, and since postpositional phrases follow the verb in Yagua, the resulting 
adverbial clause includes a clause-final marker. 

Considering these examples, we may hypothesize that grammaticalization ac- 
counts for the occurrence of final subordinators in preposed adverbial clauses. 
However, since adverbial subordinators are derived from a wide range of sour- 
ces, it is unclear at this point if the grammaticalization account is sufficient to 
explain the cross-linguistic data. Moreover, even if it turns out that the position 
of the subordinator is primarily determined by grammaticalization, this does not 
necessarily exclude the possibility that processing also affects the position of the 
subordinator as an independent factor. More research is needed to determine the 
role of grammaticalization (and processing) in the development of word order 
correlations, but I suspect that the cross-linguistic distribution of initial and final 
subordinators is primarily caused by grammaticalization rather than by Hawkins' 
principles of syntactic processing. 


5 Conclusion 


To summarize the main points of this paper, we have seen that the position of 
adverbial clauses correlates with the meaning of adverbial relations and the po- 
sition of other grammatical categories that are similar to adverbial clauses. Since 
preposed adverbial clauses include a forward orientation that deviates from the 
dominant backwards orientation of clause combining in discourse, there is no 
obvious (diachronic) connection between preposed adverbial clauses and inde- 
pendent sentences. Only conditional and some temporal clauses that precede the 
main clause are (often) based on juxtaposed sentences that are oriented towards 
the subsequent clause. All other semantic types of preposed adverbial clauses de- 
velop from, or under the influence of, other (source) constructions: adpositional 
phrases and nominalizations, pre- and postnominal relative clauses, internally 
headed relatives, and quotative constructions. 

The positional patterns of adverbial clauses can be explained by functional and 
cognitive processes that influence both speakers' choice of a particular clause or- 
der in language use and the diachronic developments of pre- and postposed ad- 
verbial clauses from certain source constructions. Some of these processes affect 
the whole class of adverbial clauses (e.g. the discourse-organizing function that 
motivates the occurrence of preposed adverbial clauses), others are only relevant 
for certain semantic types of adverbial relations (e.g. iconicity of sequence). Cru- 
cially, while the positional patterns of adverbial clauses are motivated by func- 
tional and cognitive aspects of language use, the position of the adverbial subor- 
dinator may just be a by-product of grammaticalization. Like the positional pat- 
terns of auxiliaries and other grammatical markers that evolved through gram- 
maticalization, the positional patterns of adverbial subordinators seem to be de- 
termined by the position of their diachronic sources. Since the various source 
constructions tend to occur in reverse orders in VO and OV languages, it is not 
improbable that the position of adverbial subordinators correlates with that of 
other grammatical categories in head-initial and head-final languages because of 
persistence effects in grammaticalization. However, more research is needed to 
investigate the cognitive and diachronic mechanisms behind these correlations. 


Abbreviations 


The paper abides by the Leipzig Glossing Rules. Additional or deviant abbrevia- 
tions include: 


AC        adverbial clause                   PLPF    pluperfect 
ASP       aspect                             PROSP   prospective 
COLL      collective                         PRTT    partitive 
DI        desiderative-intentional (mood)    REF     referential 
DISTRIB   distributive                       S       sentence/clause 
IPD       impeditive                         SBEN    self-benefactive 
MC        main clause                        SEQ     sequential 
MID       middle voice                       SS      same subject 
NACT      non-actual (irrealis) mood         TC      topic-contrast 
PART      particle                           VENT    ventive 
PERS      personal 


Appendix: Language sample 


AFRICA: Fongbe, Hausa, Jamsay, Kana, Khwe, Konso, Koyra Chiini, Krongo, Lan- 
go, Mayogo, Mbay, Nkore Kiga, Noon, Supyire, Tamasheq. 

NORTH AND CENTRAL AMERICA: Choctaw, (Barbareño) Chumash, Kiowa, Lakota, 
(Chalcatongo) Mixtec, Musqueam, Ojibwe, Purépecha, Rama, Slave, Tepehua, (Ja- 
mul) Tiipay, Tümpisa Shoshone, Tzutujil, Wappo, West Greenlandic. 

SOUTH AMERICA: Aguaruna, Awa Pit, Barasano, Cavineña, Epena Pedee, Hup, 
Jarawara, Kwazá, Mapudungun, Matsés, Mekens, Mosetén, Ndyuka, (Huallaga) 
Quechua, Tariana, Trumai, Urarina, Warao, Wari, Yagua, Yuracaré. 

EURASIA: Abkhaz, Ainu, (Gulf) Arabic, Basque, Evenki, French, Georgian, Ger- 
man, Hungarian, Japanese, Korean, Lezgian, Malayalam, Marathi, Persian, San- 
tali, Serbo-Croatian, Turkish, (Kolyma) Yukaghir. 

SOUTH-EAST ASIA AND OCEANIA: Burmese, Hmong Njua, Begak Ida'an, (Karo) 
Batak, Lao, Mandarin Chinese, Dolakha Newar, Qiang, Semelai, Taba, Tetun, To- 
qabaqita, Tukang Besi, Vietnamese, Yakan. 

AUSTRALIA AND NEW GUINEA: Gooniyandi, Imonda, Kayardild, Kewa, Korafe, 
Lavukaleve, Mali, Mangarayi, Menya, Motuna, Martuthunira, Ungarinjin, Wam- 
baya, Yimas. 


Chapter 6 


Attractor states and diachronic change 
in Hawkins’s “Processing Typology” 


Karsten Schmidtke-Bode 
Leipzig University and Friedrich Schiller University Jena 


This paper provides an assessment of John Hawkins's (2004; 2014) programme of 
explaining cross-linguistic regularities in terms of functional-adaptive principles 
of efficient information processing. In the first part of the paper, I systematize how 
such principles may possibly affect the diachronic development of languages, and 
I argue that evidence for efficient coding can be obtained primarily from the ac- 
tualization process, rather than the innovation stage that is at the focus of purely 
source-based approaches to explaining universals. In the second part of the paper, 
I present a small case study on a specific prediction made in Hawkins (2014), con- 
cerning the typology and diachrony of article morphemes. This will allow us to 
carve out both strengths and weaknesses of Hawkins's programme in its current 
manifestation. 


1 Introduction 


In debating the role of source- and result-oriented explanations in typology, a 
research programme that merits discussion is John Hawkins's approach to cross- 
linguistic variation, laid out most comprehensively in Hawkins (1994; 2004; 2014). 
The overarching hypothesis of these works is that many cross-linguistic gener- 
alizations about grammatical structure can be explained as adaptations to effi- 
cient information processing (“processing typology”, see Hawkins 2007). In a 
nutshell, Hawkins argues that efficient information processing can be achieved 
by (i) “minimizing domains” in which certain semantic and syntactic relations 
are processed, (ii) “minimizing forms” whenever their information content is re- 
coverable from the context or long-term statistical knowledge, (iii) arranging 
elements in such a way that the ultimate message can be transmitted as rapidly 
and accurately as possible, i.e. without delays, false predictions, backtracking, 
etc. These efficiency principles are thus attractors that are assumed to affect lin- 
guistic choices in usage events and ultimately also the conventionalized shapes 
of grammars. 

Hawkins's programme is one of the most systematic attempts to ground typo- 
logical data in psycholinguistic research and to link it to the arena of language 
use; in this spirit, it is similar, for example, to the work of Bybee (1985; 2010) 
and Croft (2001; 2003). Moreover, Hawkins's “performance-grammar-correspon- 
dence hypothesis”, according to which grammatical rules are basically crystal- 
lized usage preferences, echoes one of the key tenets of the usage-based theory 
of language (Langacker 1987; Kemmer & Barlow 2000). And some specific effi- 
ciency principles, such as the "minimization of forms" in proportion to their de- 
gree of predictability, even have exact parallels in other functional-typological 
works (e.g. Haiman 1983; Croft 2003; Haspelmath 2008). 

At the same time, however, Hawkins's work is not always received uncritically 
within usage-based linguistics. Among other things, it is couched in a formal 
phrase-structure architecture that appears to presuppose the existence of many 
grammatical categories (see Diessel 2016); some of its principles have been criti- 
cized for not being truly domain-general but perhaps specific to language (such 
as a pressure for short constituent recognition domains, see Bybee 2010); and cru- 
cially in the present context, Hawkins has also been criticized for neglecting or 
underestimating the diachronic dimension behind the phenomena he attempts 
to explain (e.g. Cristofaro 2017; Collins 2019 [this volume]). But to the extent that 
Hawkins does make reference to historical developments, the nature and plau- 
sibility of his diachronic claims are worth investigating in more detail, which is 
precisely what the present contribution aims to do. 

To this end, the first part of the paper develops a systematization, in the usage- 
based framework, of how Hawkins's functional-adaptive principles possibly af- 
fect the diachronic development of languages. I argue that there is solid evidence 
for efficient information processing in the moulding of grammar, suggesting that 
there is a place for result-oriented processes, beside source determination, in ac- 
counting for typological distributions. In the second part of the paper, I focus, 
by way of example, on a diachronic prediction made in Hawkins (2014), according to 
which languages of different word-order types show markedly different propen- 
sities for grammaticalizing definite articles (the prediction will be formulated 
more precisely as we go along). This miniature case study will not only serve as 
a testing ground for this specific efficiency-based hypothesis, but also allow us 
to identify some general merits and potential problems of processing typology. 


2 The diachronic dimension in processing typology 


In the usage-based approach, language change is conceived as a multi-step pro- 
cess (Croft 2000; 2006; Aitchison 2013) that starts by breaking a convention in the 
form of a linguistic innovation ("altered replication"), followed by the spread of 
that innovation through both the linguistic system ("diffusion") and the speech 
community (“propagation”). Hawkins's publications contain a number of indi- 
cations as to how his efficiency principles influence innovation, diffusion and 
propagation processes. I will tackle each of them briefly, in reverse order, as this 
reflects an increasing degree of explicitness of the respective proposals.¹ 

As for PROPAGATION processes, Hawkins is usually reticent with regard to the 
forces that implement efficient structures, even though the central diachronic 
mechanism in his programme is that of "selection": Efficient variants are said 
to be selected relatively more frequently than their inefficient counterparts, un- 
til they may ultimately oust the inefficient ones completely. It is in this way, 
Hawkins argues, that preferred patterns in performance can conventionalize into 
grammatical rules, although he concedes in Hawkins (2014: 10) that it is presently 
poorly understood how exactly this "translation from performance to grammar" 
works. Now, if one subscribes to the view that propagation is entirely driven 
by sociolinguistic forces like prestige, solidarity and the resulting accommoda- 
tion (e.g. Croft 2000; Cristofaro 2017; Cristofaro 2019 [this volume]), it remains 
mysterious, indeed, how Hawkins's very idea of selection processes can fit in. 

On the other hand, there are well-known accounts of language change in 
which propagation is not exclusively a social phenomenon: Keller's (1994) “Invis- 
ible Hand” theory, for example, leaves room for functional considerations in the 
selection process. Some of Keller's classic examples of invisible-hand processes, 
such as the emergence of a traffic jam or a short-cutting footpath, do not, in fact, 
involve social motives: People follow a certain course of action because they pri- 
marily consider its functional advantages, regardless of the sociolinguistic profile 
of the person whose behaviour they adopt. Cristofaro (2017) claims that there is 
no empirical evidence at all for this scenario in linguistics, but this assessment 
is overly pessimistic: Rosenbach (2008), in a detailed examination of evolution- 
ary accounts of language change, concludes that "the evidence available does not 
speak for the exclusive role of social factors in the selection process" (Rosenbach 


¹Although the present section is specifically about Hawkins’s work, it actually applies to 
"functional-adaptive constraints" (Haspelmath 2019 [this volume]) more generally, not least be- 
cause Hawkins's processing typology draws on and incorporates similarly-minded principles 
from many other functionalist typologists (e.g. Greenberg, Comrie, Keenan, Givón, Haiman, 
Croft, Haspelmath, etc.). 
2008: 44; emphasis in original). Therefore, I currently see no reason to dismiss 
a priori a theory in which both social and functional selection pressures can be 
operative in propagation (see also Haspelmath 1999; Nettle 1999; Enfield 2014 for 
similar positions).² On this view, then, Hawkins's efficiency principles are rel- 
evant to, and hence at least partially drive, the propagation process, although 
empirical evidence that clearly disentangles functional and social selection pro- 
cesses is, of course, very hard to come by (see also Seiler 2006). 

The empirical picture is clearer, in my view, when it comes to DIFFUSION or 
ACTUALIZATION processes, i.e. the spread of an innovation through the linguistic 
system.³ Although Hawkins himself does not speak of diffusion or actualization, 
the process is actually highly germane to his research, as many of the phenom- 
ena he discusses in support of his efficiency theory are cases of limited diffusion. 
In relativization, for example, a well-known pattern is for resumptive pronouns, 
once they have been innovated, not to spread across the entire range of relativiza- 
tion sites, but to be restricted to certain sections of Keenan & Comrie's (1977) 
accessibility hierarchy (as in Hausa, Hebrew, Welsh and many other languages). 
Similarly, when object case markers develop and spread within the linguistic 
system, they typically end up being confined to animate, definite or pronominal 
objects, rather than being extended across the board (see, e.g., Sinnemäki 2014 
for a quantitative study). 

Many other cases of such differential marking are collected in Haspelmath 
(2008) and subsumed by Hawkins (2004; 2014) under his “Minimize Forms” prin- 
ciple: The marker in question is applied to those environments that require more 
processing effort, and is left out economically elsewhere. Processing effort, in 
turn, may be related to various factors, notably constraints on working memory 


²Note that recent mathematical models of language change (e.g. Blythe & Croft 2012) clearly 
show that selection as such is a crucial element of propagation processes, in as far as alternative 
models of propagation that do not rely on a weighting of linguistic variants (e.g. Trudgill 2004) 
do not produce the empirical patterns of propagation that have been established in historical 
linguistics and sociolinguistics. However, Blythe & Croft (2012) also concede that their model 
cannot distinguish between social and functional factors in selection, i.e. it leaves open which 
of these is more vital in the propagation process or how they possibly interact. 

³The term DIFFUSION is best known in the context of "lexical diffusion" (Wang 1969), which 
refers to the successive spread of a phonetic or morphosyntactic innovation to different lexical 
items (e.g. the Progressive construction to more and more lexical verbs, or final consonant 
devoicing to all relevant words). In the present paper, I am using the term diffusion in a broader 
sense, comprising also the application of an innovated grammatical marker or construction to 
a new morphosyntactic environment (e.g. the extension of all but in its historically younger 
sense ‘almost’ from adjectival uses (This was all but remarkable) to verbal environments (He all 
but fell down), see De Smet 2012). Diffusion is thus synonymous with the term ACTUALIZATION 
(Timberlake 1977, Andersen 2001 and many others, most recently De Smet 2012). 
(e.g. longer processing domains correlating with resumptive pronouns) and the 
relative unexpectedness (‘surprisal’) of a given configuration (see Norcliffe et al. 
2015 on memory- and expectation-based processing in cross-linguistic perspec- 
tive). For example, discourse participants and animate entities are more likely to 
be subjects than objects, hence it is precisely these kinds of objects that are more 
surprising. Paired with the Hawkinsian assumption of efficiency on the part of 
the speaker, it is also only these objects that need to be marked overtly. A similar 
surprisal-based account is provided by Haig (2018) to explain “why differential 
object indexation is an attractor state” (Haig 2018: 781) in the grammaticalization 
of object pronouns. 

In Hawkins's programme, then, all of these cases are amenable to an expla- 
nation in terms of efficient information processing. I believe that this account is 
presently superior to purely source-oriented typologies such as Cristofaro's, for 
the following reasons. 

Firstly, there is solid evidence for efficiency where the occurrence of a particu- 
lar marker is optional. This can be observed, for example, with variable relativiz- 
ers that, other things being equal, show up less frequently when a relative clause 
is statistically expected given the previous co-text, and vice versa (Wasow et al. 
2011). As Fox & Thompson (2007) observe, a sentence like 


(1) This was the ugliest set of shoes [I ever saw in my life]. 


would sound “quite awkward” (Wasow et al. 2011: 181) if the relative clause were 
introduced by that; according to Wasow et al., this is precisely because a relative 
clause is expected in this context, which is in turn why relative that tends to be 
omitted efficiently. Jaeger (2010) shows that similar predictability effects account 
for a large portion of the variability of the English complementizer that. 

Importantly, the same kinds of effect also show up in psycholinguistic exper- 
imentation, and in languages other than English. For example, recent studies 
have shown that optional case marking in Japanese, optional indexation in Yu- 
catec Maya relative clauses or optional plural marking in an artificial language 
exhibit an efficient distribution in the participants' linguistic behaviour, other 
things being tightly controlled for (Kurumada & Jaeger 2015; Norcliffe & Jaeger 
2016; Kurumada & Grimm 2017). All of these synchronic effects are independent 
of the historical source of the respective marker. In other words, no matter how 
a particular relativizer emerges, it comes to be applied in ways that are consonant
with Hawkins's efficiency predictions. And as, for example, Seržant (2019
[this volume]) shows, such optional marking can conventionalize into more fixed 
grammatical patterns over time. 

Secondly, to the extent that the “Minimize Form” effects are typologically
sound (i.e. independent of geographical and genetic affiliations), they are in con- 
trast with a powerful principle that we observe elsewhere in grammars, viz. the 
potent force of analogy (see Gentner & Smith 2012; Blevins & Blevins 2009). Anal- 
ogy is the driving force behind lexical diffusion, and where it runs to (near) com- 
pletion, the result is a productive grammatical rule in the traditional sense. Time 
and again, historical studies show just how sweeping analogical extension can 
be: By incremental diffusion processes, English has conventionalized a rule that 
every main clause requires an overt subject, and every lexical verb now needs 
do-support if it is to occur in an interrogative clause. In other languages, split 
alignment systems are gradually being eliminated in favour of unified marking: 
for example, younger speakers of Choctaw (Muskogean: USA) are in the pro- 
cess of re-shaping split intransitivity into a nominative-accusative system with 
consistent coding for the S argument (Broadwell 2006: 140); and Creissels (2018) 
argues more generally that there are strong analogical pressures on languages 
to regain consistent alignment patterns if these get disrupted by grammaticalization
processes.⁴

In view of these analogical forces, one may wonder why systems of differential 
resumption, differential object marking or differential possessive marking exist 
quite pervasively, and in highly systematic ways. In Cristofaro's account, they are 
persistence effects, i.e. they are all due to the fact that the unmarked meanings 
are perceived as incompatible with the source construction. For example, when 
an object marker originates from a topic marker, it is expected to be restricted to 
object NPs whose properties are most closely associated with topicality (or topic- 
worthiness), such as pronominal, animate and definite entities, and not to apply 
elsewhere. In fact, Dalrymple & Nikolaeva (2011) argue that such erstwhile topic 
markers are often extended to animate and/or definite objects, thus diffusing 
in principled ways to create DOM patterns that may plausibly be linked to the 
source construction. However, given the powers of analogy, why does diffusion 
stop there? If it is really the source construction pulling its weight here, one may 
wonder why it does not do so in many other cases. 

Just consider what is perhaps the textbook example of a development that stan- 
dardly overrides effects from the source construction, namely diffusion processes 
in grammaticalization. It is by analogical extension that the going-to-future has 
spread to inanimate subjects (The icicle is going to break off.), and that the French 
negative marker pas has been extended beyond contexts of directed motion (Je
ne vais pas. ‘I'm not going.’ > Je ne sais pas. ‘I don't know.’). In other words, if
analogy works in many other instances of grammaticalization, why not in those
cases that involve differential marking? Siding with Hawkins (and Haspelmath
2019 [this volume]) here, I find it convincing that an attractor state of efficient
coding shapes the development of grammatical systems, especially in light of the
behavioural evidence cited above.

4 For example, “many languages in which the grammaticalization of a new TAM form resulted
in [tense-based split ergative alignment] have undergone a subsequent evolution that can be
characterized as regularization under the pressure of analogy” (Creissels 2018: 81).

Thirdly, while I agree with Cristofaro (2019 [this volume]) that functional principles
should be “visible” in the diachronic development of particular structures,
I find her interpretation of this requirement too narrow: She demands that the 
alleged motivations be present at the innovation stage of a grammatical construc- 
tion and hence directly influence its emergence. But as we have just seen, it is 
often during the actualization phase that functional-adaptive principles are oper- 
ative, regardless of how or why a given marker originated in the first place (see 
also Seržant 2019 [this volume]).

Interestingly, while Joan Bybee is now often cited as a representative of source- 
oriented typology, her relevant publications reveal a broader perspective than 
Cristofaro's: "Identifying the causal mechanisms [that lead to typological gen- 
eralizations] requires a detailed look at all the properties of a change - includ- 
ing its directionality, gradualness, spread through the community and through 
the lexicon" (Bybee 2008: 108; see also Bybee 1988). Crucially, it is precisely in 
lexical-diffusion processes that many of her well-known frequency effects apply:
For example, Bybee's “conserving effect” of token frequency explains why
highly entrenched main verbs like speak, think and mean resisted the innovative 
do-support in wh-questions for a long time (e.g. What spekest thou?, see Ogura 
1993), or why the change from -th to -s in the third-person of English verbs af- 
fected the most frequent verbs last (notably hath and doth, see van Gelderen 2014: 
172). In a similar vein, I would thus argue that the diffusion or actualization stage 
is highly relevant for the kinds of effect that lead to efficient typological marking 
patterns. 

In conclusion, I consider Hawkins's account (and functional-adaptive motiva- 
tions of similar kinds) capable of explaining why certain changes do not happen 
- particularly, why we find that analogical extensions are systematically brought 
to a halt even though they are so commonly carried through in other domains of
grammar.⁵


5 See also Smith (2001) for a similar view: He investigates the diachronic development of agreement
loss in Romance participles and argues that while parsing principles cannot be held responsible
for the rise of participial agreement, they did play a role in its gradual disappearance.
Specifically, Smith claims that agreement was retained longest in those environments where 
it was most beneficial for processing. Therefore, “functionality is here acting as a brake on 
actualization” (Smith 2001: 214), just as I argued more generally above. 

Let us finally turn to the realm of INNOVATION, i.e. Hawkins's suggestions as
to why, where and when certain grammatical structures emerge. A central con- 
cept here is that of correlated evolution: When a language changes in one part 
of the grammar, Hawkins often expects to see "ripple effects" (Hawkins 2014: 
88) in domains that are linked to the changing subsystem by certain efficiency 
principles. For example, since Hawkins assumes that phrases of different types 
(VPs, PPs, NPs, etc.) show harmonic ordering patterns to allow efficient sentence 
processing, a change from OV to VO is predicted to engender innovations in PPs 
and NPs as well (see Dunn et al. 2011 and the papers in Linguistic Typology 15(2) 
for ample discussion of this issue). In the present context, perhaps the most in- 
teresting claim with regard to innovation is that efficiency principles can predict 
the occurrence of grammaticalization: While many grammaticalization paths are 
universal "attractor trajectories" (Bybee & Beckner 2015) - open to all languages 
with similar source constructions due to the same mechanisms of reanalysis -, 
Hawkins's efficiency principles predict under which structural conditions (e.g. in 
which language "types") particular events of grammaticalization are more or less 
likely to happen. In the remainder of this paper, I will briefly discuss a specific 
example of such a hypothesis developed by Hawkins (2014). 


3 A test case for processing typology 


In his (2014) monograph, Hawkins examines the structure of noun phrases (NPs) 
from a processing perspective. Across the world's languages, NPs often contain 
elements in addition to the head noun that, in Hawkins's view, can function as 
processing cues to the recognition (or online “construction”) of an NP, such as 
articles, classifiers and related morphemes. Hawkins argues that such elements 
are more efficient in VO languages than in OV languages: As illustrated schemat- 
ically in Figure 1, an additional NP constructor C in a VO language can shorten 
the domain for the construction of the VP (V+NP), especially if N is delayed by
intervening material (e.g. in AP-N sequences like the very delicious meal). In an
OV language, by contrast, additional NP constructors lengthen this dependency
domain, no matter where they occur in the NP:


VO languages              OV languages
VP[V NP[N ...]            [[... N ... C] NP V] VP
VP[V NP[C ... N ...]]     [[... C ... N] NP V] VP


Figure 1: V-NP processing in VO- and OV-languages (adapted from
Hawkins 2014: 125)


6 Although Hawkins frames the idea of “online construction” in terms of syntactic trees, nodes
and categories, the basic intuition behind it is functional in nature: Translated into the usage-based
parlance of, e.g., Croft (2001), Beckner & Bybee (2009) or Bates & MacWhinney (1989),
Hawkins's idea is that a referential expression should be recognizable as such, based on reliable
cues in the speech stream. Referential expressions (or NPs, for that matter) are arguably
best cued by nouns and determiners, and the construction of an NP is thus facilitated by the
early availability of such “constructing categories” within the string of units that ultimately
belong to the NP. More generally, I believe that Hawkins is thus actually quite compatible
with usage-based and construction-grammatical conceptions of processing, even though he
uses terminology that is closely associated with generative syntax.


From these considerations, one might derive the following prediction: 


(2) While all languages have source constructions for articles (notably demonstratives
for definite articles and the numeral 'one' for indefinite articles),
the grammaticalization of these sources into more general NP markers 
should be a more productive historical process in VO languages than in 
OV languages. As a result, the synchronic typological distribution of arti- 
cles is significantly different in the two language types. 


As a matter of fact, Hawkins's (2014) prediction is narrower in scope: He applies it
only to definite articles, and only to independent definite articles (i.e. words and 
clitics rather than affixes). The following examples illustrate the language types 
that are expected to be frequent according to (2): 


(3) a. VO with definite article
       Maori (Austronesian, Oceanic; Bauer 1993: 256)
       I    kite  ia   i    te   whare.
       T/A  see   3SG  OBJ  DET  house
       ‘She saw the house.’
    b. OV without definite article
       Lezgian (Nakh-Daghestanian, Lezgic; Haspelmath 1993: 343)
       Ada-z   balk'an  aku-na.
       he-DAT  horse    see-AOR
       ‘He saw the horse.’

In support of the hypothesis in (2), Hawkins cites some of Dryer's (2005) WALS
data, which show, indeed, that definite-article words are relatively more frequent 
in VO-languages (more on the data below). 

Hawkins's approach, as illustrated by this specific example, has a number of 
assets: For instance, it emphasizes the importance of the linear dimension of 
language, which tightly constrains production and parsing processes but which 
tended to be neglected by (at least early) cognitive-linguistic and construction- 
based approaches to grammar (see also Diessel 2011 for a similar critique). Haw- 
kins's work is clearly pioneering here, and in the recent usage-based literature, 
related notions like contextual predictability, informativity and projective links 
have come to take a highly prominent place (see, e.g., Gahl & Garnsey 2004; 
Levy 2008; Auer 2009). Furthermore, Hawkins's diachronic thinking adds a new 
dimension to classic research in grammaticalization. As Good (2008: 7) points 
out, work on grammaticalization typically offers “permissive explanations [...], 
that is, it focuses on particular grammaticalization paths without, in general, ac- 
counting for what factors will cause one language, but not another, to instantiate 
those paths”. Hawkins's approach elevates this “permissive” nature of explanation
to what Good (ibid.) calls a “probabilistic” one: It attempts to explain why
certain grammaticalization processes are set in motion only (or preferably) in 
certain language types or at certain points in time (see also Hawkins 1986; 2012 
for representative work along these lines). 

But just how convincing are such claims and the empirical support that Hawkins
provides for them? In the present case, I have a number of reservations about
the picture drawn in Hawkins (2014). 

To begin with, I do not quite see why the hypothesis is restricted to the de- 
velopment of definite articles, as indefinite articles should qualify equally well 
as NP constructors. Similarly, Hawkins's preoccupation with word-based pro- 
cessing (which is prominent throughout his 2014 book), to the neglect of affixes 
with identical functions, is not sufficiently motivated. In addition to the prob- 
lem that free and bound markers are very hard to distinguish consistently for 
cross-linguistic comparison (Haspelmath 2011), it remains unclear if there is a 
measurable psycholinguistic difference between word- and affix-processing. As 
long as there is no evidence for the view that free and bound definiteness mark- 
ers are parsed in fundamentally different ways, we should rather take a more 
embracing approach to the data and ask whether VO- and OV-languages differ 
in their propensity to grammaticalize article morphemes from their respective 
source constructions. 

With these considerations in mind, the first step of the empirical assessment
is, just like in Hawkins (2014), to examine the typological distribution of arti- 
cle morphemes. Dryer's WALS data, in their most recent version, are set out in 
Table 1. 


Table 1: Distribution of articles in different word-order types (Dryer
2013a,c)


                     VO    OV   ndo   Totals
Distinct ART word   144    52    14      210   (ART)
DEF affix            49    33     6       88   (ART)
DEM used as ART      30    33     5       68   (ART)
Only INDEF ART       20    24     0       44   (ART)
No ART               70   111    14      195   (NO ART)
Totals              313   253    39      605

For the purposes of testing our revised version of Hawkins's hypothesis, we 
need to discard the languages without a dominant order of V and O ("ndo"), and 
we basically conflate the figures in the first four rows of Table 1 and contrast them 
with those in the final row. In other words, (i) we consider both free and bound 
definiteness morphemes; (ii) we include those languages which are beginning to 
use a demonstrative like an article (row 3, see Dryer 2013a for details) — thus 
incorporating cases of incipient grammaticalization; (iii) we include languages 
with indefinite articles only.⁷ The conflated form of the data is shown in
Table 2.
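
The conflation just described can be retraced with a few lines of Python. This is only a sketch for checking the arithmetic: the counts are those of Table 1, while the dictionary layout and the function names `conflate` and `share_with_articles` are illustrative choices and not part of the original analysis (which was carried out in R).

```python
# Counts copied from Table 1 (Dryer 2013a,c). Languages without a dominant
# order of V and O ("ndo") are left out, as described in the text.
table1 = {
    "Distinct ART word": {"VO": 144, "OV": 52},
    "DEF affix":         {"VO": 49,  "OV": 33},
    "DEM used as ART":   {"VO": 30,  "OV": 33},
    "Only INDEF ART":    {"VO": 20,  "OV": 24},
    "No ART":            {"VO": 70,  "OV": 111},
}


def conflate(rows):
    """Merge every row except 'No ART' into a single 'ART morph' row."""
    out = {
        "ART morph": {"VO": 0, "OV": 0},
        "No ART morph": dict(rows["No ART"]),
    }
    for label, counts in rows.items():
        if label != "No ART":
            for order, n in counts.items():
                out["ART morph"][order] += n
    return out


def share_with_articles(table, order):
    """Raw proportion of languages of one word-order type with an article morpheme."""
    art = table["ART morph"][order]
    return art / (art + table["No ART morph"][order])


table2 = conflate(table1)
for order in ("VO", "OV"):
    print(order,
          table2["ART morph"][order],
          table2["No ART morph"][order],
          round(share_with_articles(table2, order), 3))
```

The resulting proportions are raw figures only; they are not controlled for genealogical or areal dependencies.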

Table 2: Distribution of articles in different word-order types (reorganized)


                 VO    OV   Totals
ART morph       243   142      385
No ART morph     70   111      181
Totals          313   253      566


The distribution in Table 2 looks conspicuously skewed, but of course these are
raw data that are not controlled for genetic and areal effects.⁸ Therefore, what
Hawkins's (2014) analysis clearly needs to be augmented with (in this case as well
as virtually all others in his book) is proper statistical modelling according to contemporary
standards (see, e.g., Bickel 2011). To this end, I am seeking converging
evidence from two complementary quantitative approaches to the data, namely
mixed-effects logistic regression (see also Cysouw 2010; Jaeger et al. 2011) and
Bickel's (2011; 2013) Family Bias Method (which is particularly suitable to testing
hypotheses formulated in diachronic terms). In the supplementary materials
to this paper,⁹ I offer a more detailed, non-technical introduction to the Family
Bias Method (SM1), as well as the statistical properties of all models (SM2-5). For
reasons of space, I here confine myself to describing some major results of the
analyses.¹⁰

7 Some readers may object to this way of grouping the data. For example, one might reasonably
argue that languages in which demonstratives are used with some article-like functions
should not be said to have “proper” articles (yet). However, even when such languages are
classified differently for statistical purposes, the results remain the same in many respects (see
supplementary material SM3.2).

8 For similar raw data, see also Dryer (2009), who endorses Hawkins's processing explanation.

Figure 2 shows that there is a significant effect of word order on the occurrence 
of articles in a mixed-effects regression model (β = -0.73, p < 0.001). Although
the model is not particularly good overall, probably missing important further
predictors (R² = 0.14, C = 0.72), Hawkins's hypothesized effect is clearly present,
as the probability of not having articles (y-axis) increases significantly as we go 
from VO to OV (x-axis). 

In the Family Bias estimations, too, it turns out that, among those families that 
do not just show a chance distribution of articles, VO families are about 2.6 times 
more likely to develop articles than OV families. This is illustrated in Table 3, 
and Figure 3 shows that this effect is stable (i.e. never reversed) across all six 
macro areas. In sum, the global typological picture is consistent with Hawkins's 
processing account, even when tested against a more comprehensive data set and 
with more rigorous modes of examination. 


9 See http://www.kschmidtkebode.de/publications or http://doi.org/10.5281/zenodo.2577480.

10 All statistical analyses were performed in R 3.3.1 (R Development Core Team 2016). I am grateful
to Taras Zakharko and Balthasar Bickel for making their Family Bias algorithm freely available
(Zakharko & Bickel 2011ff.).

11 All regression analyses I performed are based on generalized linear mixed-effects models that
include genealogical and macro-areal dependencies as random effects (see SM3). The model in 
Figure 2, for example, contains by-family and by-area random intercepts for the distribution 
of articles, while a by-area random slope for the word-order effect did not improve the model 
significantly and was hence excluded from the final model. 

Figure 2: Effect of word order type on the probability of (not) having
articles in a mixed-effects model (see SM3 for details) 


Table 3: (Rounded) family biases for articles in different word-order
types (N_total = 217 genetic units, 99 of which are estimated to be “biased”
(as opposed to internally diverse); Fisher exact test, p = 0.039)


                 VO    OV   Totals
ART morph        50    19       69
No ART morph     15    15       30
Totals           65    34       99
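
As a rough check on the figures reported for Table 3, the odds ratio and a two-sided Fisher exact test can be computed from the rounded counts. The sketch below uses only the Python standard library; the function names are illustrative, and the two-sided p-value follows the common convention of summing all tables that are no more probable than the observed one. This is assumed here rather than documented in the text, and since the counts are rounded estimates, the result need not match the reported value exactly.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )


# Rounded family-bias counts from Table 3:
# rows = ART morph / no ART morph, columns = VO / OV.
a, b, c, d = 50, 19, 15, 15
print(round(odds_ratio(a, b, c, d), 2))   # 2.63, i.e. "about 2.6 times"
print(fisher_exact_two_sided(a, b, c, d))
```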


Recall, however, that a second prediction of this account is that articles are es- 
pecially useful in those VO languages that have modifiers before the head noun 
in NPs (a very delicious meal). One would thus expect, for example, that the gram- 
maticalization of articles is particularly productive in VO languages with ADJ-N 
order, and, from an efficiency perspective, less so in those with N-ADJ order. I 
tested this by examining the order of nouns and adjectives (Dryer 2013b) in all 
VO languages in the same sample as above (N_total = 278 languages).

Figure 3: Family biases by macro area (see SM2 for details)


Across several different statistical models (and operationalizations of the hy- 
pothesis, see SM4), I did not find support for Hawkins’s efficiency hypothesis. 
In one analysis, for example, I probed whether article words are more likely in 
VO languages with ADJ-N order than in those with N-ADJ order. Figure 4 shows 
that this is neither the case for definite articles nor for articles in general.¹²

Figure 4: Occurrence of articles in VO languages depending on the
position of adjectives (left plot: definiteness words only; right plot:
all article-like words; for the corresponding mixed-effects models, see
SM4)


Clearly, this picture does not speak for a critical processing pressure being at
work. And the same conclusion actually carries over to OV languages: It is true
that, to the extent that these languages show a reduced propensity for developing
articles, they manage to keep NP processing domains slightly shorter; but there
are several indications that this pressure cannot be particularly strong.

12 Moreover, if we look at VO languages which are beginning to use a demonstrative as a definite
article (N = 26 in Dryer 2013a), Hawkins’s account would lead us to expect that such incipient
grammaticalization is particularly frequent in the constellation DEM-N and ADJ-N (and again
less frequent if N precedes both the ADJ and the DEM). Now, of the 26 languages in question,
22 are N-DEM and four are DEM-N. It is the latter type that is interesting here, and we find
that two of these four languages are ADJ-N and the other two N-ADJ. Again, no clear pattern
along Hawkinsian lines can be detected here.

First, our Family Bias calculations show, in addition to the findings from above, 
that none of the large OV families in the sample actually exhibits a significant bias 
(towards or against articles) in the first place; they are all internally diverse, i.e. 
with no more than chance distributions of articles (Table 4).¹³


Table 4: Distribution of biases (for or against) articles among large families
in the sample (N_total = 29 genetic units)


                       VO    OV   Totals
significantly biased   12     0       12
internally diverse      6    11       17
Totals                 18    11       29
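
The estimate of (minimum) pressure strength that Bickel (2013) proposes, (k+1)/(n+2) for k biased families among n families of a given kind, can be applied directly to the counts in Table 4. The following is a minimal sketch; the function name `minimum_pressure` is hypothetical, and the VO value is computed here purely for comparison and is not reported in the text.

```python
def minimum_pressure(k, n):
    """Estimate attributed to Bickel (2013) in the text: (k + 1) / (n + 2),
    where k is the number of biased families among n families of one kind."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return (k + 1) / (n + 2)


# Large families from Table 4: (significantly biased, total) per word-order type.
large_families = {"VO": (12, 18), "OV": (0, 11)}

for order, (k, n) in large_families.items():
    print(order, round(minimum_pressure(k, n), 3))
```

For the OV families this gives (0+1)/(11+2), which rounds to 0.077.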


13 Bickel (2013) suggests that the (minimum) strength of a universal pressure can be calculated
on the basis of the proportion of biased families k among all families of a particular kind n
(here: OV families): ŝ = (k+1)/(n+2). Based on the figures in Table 4, we obtain ŝ(OV) = (0+1)/(11+2)
= 0.077. This estimate is so small in magnitude that one is forced to conclude that there
is no particular pressure at all on OV languages with regard to the development of articles.

Second, from a more qualitative perspective, there is suggestive evidence that
a potential efficiency motivation in OV languages is easily overridden by other
factors. For example, Ross (2001) discusses an interesting case of an intense contact
situation in which the Austronesian language Takia adapted its VO syntax
to the OV structure of its Papuan contact language Waskia. Ross argues that, in
the wake of this restructuring, Takia speakers must have shed the prenominal
article word in NPs (see 4a), which would be fully in line with Hawkins's prediction.
At the same time, however, the degree of linguistic accommodation was so
intense that Takia speakers also did something else: They grammaticalized a postnominal
deictic element into a postnominal demonstrative with some article-like
functions, reproducing exactly the article pattern in Waskia (see 4b-c).


(4) a. Proto-Western Oceanic (Ross 2001: 142)
       a    tamʷata  a-ña
       DET  man      that-3SG
       ‘that man’
    b. Takia (Austronesian, Oceanic; Ross 2001: 140)
       Waskia   tamol  an
       Waskian  man    that
       ‘that Waskia man’
    c. Waskia (Nuclear Trans New Guinea, Madang; Ross 2001: 140)
       Waskia  kadi  mu
       Waskia  man   that
       ‘that Waskia man’


In other words, Takia speakers chose precisely the diachronic route that Haw- 
kins would predict to be disfavoured, which goes to show that the alleged pro- 
cessing pressure cannot have been very strong, after all. In this connection, one 
may also recall that our regression model from above, while bringing out a signif- 
icant global effect from word order type, did not provide a particularly good fit 
to the data. The substantial amount of variation in the data that it cannot account 
for must thus be attributed to other, possibly stronger factors.¹⁴


14 Diachronic research has actually put forward a number of plausible candidates for such factors.
A prominent one since at least Vennemann (1975) is the loss of a case system and the
concomitant rigidification of constituent order, which favours the development of articles to 
express information-structural distinctions that were previously coded by a more flexible word 
order (see also Hawkins 2004; Hewson & Bubenik 2006; Fischer 2010; Carlier & Lamiroy 2014). 
Another possible factor is the loss of an aspectual system (Abraham 1997; Leiss 2000; 2007). 
However, especially the former type of explanation is often viewed critically (e.g. Selig 1992 
on Romance; McColl Millar 2000 on English; Leiss 2000 on Germanic), and there is currently 
no proposal as to how various factors may conspire to explain the synchronic distribution 
of articles (see also Lüdtke 1991). For some further information and preliminary typological 
analyses of these factors, interested readers are kindly referred to SM5. 

We conclude, then, that Hawkins (2014) correctly predicts a global difference
between OV- and VO-languages in the development of articles. But the present 
analysis also revealed some challenges for this account. Therefore, it still needs to 
be established by future research whether the global correlation between word 
order type and the absence of articles really reflects a causal connection between 
these two phenomena, and whether this could be attributed to efficient informa- 
tion processing. If it turns out that Hawkins is correct, the findings in the present 
section suggest that we would be dealing with a “weak universal pressure” in the
sense of Seržant (2019 [this volume]) or a “weak cognitive bias” with “significant
population-level consequences” (Thompson et al. 2016: 4530).


4 Concluding remarks 


The present contribution has taken a closer look at John Hawkins's “process- 
ing typology”, a research programme that fully subscribes to functional-adaptive 
motivations for grammatical structure. In the first part of the paper, I discussed 
where such motivations are possibly operative in diachronic change. In my view, 
a case can be made for Hawkins's efficiency considerations in the process of actualization,
i.e. when a linguistic innovation comes to be extended to a principled,
cross-linguistically similar subset of potential application sites (as in differential 
flagging and indexing, relativization, etc.). In this respect, I consider Hawkins's 
account as superior to purely source-oriented explanations of grammatical pat- 
terns. Of course, this does not deny that persistence accounts are relevant to ty- 
pological patterns - they clearly are; but it argues against persistence as the sole 
or perhaps even the dominant explanatory principle for grammatical universals. 

A more ambitious but also undoubtedly more problematic move is to link pars- 
ing and efficiency to certain innovation processes, such as when a particular 
grammaticalization channel is predicted to be set in motion only under specific 
structural conditions. In the brief case study presented here, we saw that Haw- 
kins's NP processing hypothesis provides a neat match to the global typological 
data, even when these are analyzed in more rigorous and hence more appro- 
priate ways than in Hawkins (2014). On the other hand, the details of neither 
the typological picture nor individual diachronic studies produce evidence for a 
strong pressure on languages to develop into the predicted directions. Therefore, 
the hypothesis that speakers of OV languages are significantly less inclined than 
speakers of VO languages to grammaticalize additional NP constructors, remains 
plausible but currently rather weakly substantiated. 

What we would need to see to make it more convincing is a triangulation of (i)
typological data that are large enough to take several alternative predictor vari- 
ables from the literature into account (e.g. case and aspect systems, the presence 
of other NP constructors such as classifiers), (ii) diachronic data from languages 
that have undergone (or are in the process of undergoing) changes in basic word 
order, (iii) behavioural evidence, such as psycholinguistic experimentation with 
artificial languages (e.g. along the lines of Culbertson et al. 2012; see also Lev- 
shina 2019 [this volume]). As a matter of fact, a particularly strong aspect of 
Hawkins's work (especially in Hawkins 2004; 2014) is that it generally attempts 
precisely this kind of methodological cross-fertilization; but for the domain at 
issue here, such an approach has yet to be fleshed out in sufficient detail. 


Abbreviations 
T/A   tense/aspect marker


Weak universal forces: The
discriminatory function of case in 
differential object marking systems 


Ilja A. Seržant
Leipzig University 


Standard typological methods are designed to test hypotheses on strong universals 
that broadly override all other competing universal and language-specific forces. 
In this paper, I argue that there exist also weak universal forces. Weak universal 
forces systematically operate in the course of development but then interact with, 
or are even subsequently overridden by, other processes such as analogical exten- 
sion, persistence effects from the source function, etc. This, in turn, means that 
there can be statistically significant evidence for violations at the synchronic level 
and, accordingly, only a weak positive statistical signal. But crucially, the absence 
of statistical prima-facie evidence for such forces does not amount to evidence for 
their absence. The assumption that there are also weak universal forces that affect 
language evolution goes in line with the view that human cognition in general and 
language acquisition in particular are constrained by probabilistic biases of differ- 
ent range, including weak ones (cf. Thompson et al. 2016). By way of example, the 
present paper claims that the discriminatory function of case in differential object 
marking (DOM) systems is a weak universal: It keeps appearing in historically, syn- 
chronically and typologically very divergent constellations but is often overridden 
by other processes in further developments and is, therefore, not significant at the 
synchronic level in a large sample. 


1 Introduction 


In this paper, I adopt a dynamic approach to universals (Greenberg 1978) and, 
accordingly, the following definition of a universal: 

(1) A dynamic definition of universals
principled preferences that affect how languages change over time 
(Bickel 2011: 401). 


I conceive of these preferences as statistical tendencies (cf. Bickel 2011) rather
than as "inviolable constraints" on language as in Kiparsky (2008). This definition singles
out those universals that are not predetermined by the historical origin of the
structures in question, thus resembling Haspelmath's "functional-adaptive constraints"
on language (Haspelmath 2019 [this volume]). Universal forces of this
kind produce structures that occur with "overwhelmingly greater than chance
frequency" or "well more than chance frequency" (Greenberg 1963: 62, 64, passim),
and they thus allow for exceptions. The number of such exceptions, in turn,
is indicative of the strength of a universal force.

Strong universal forces reveal themselves as universal on both of the method- 
ological approaches used in typology: on the static and on the dynamic approach 
(see Greenberg 1969 for these notions). The former crucially relies on the rela- 
tive frequency in the synchronic distribution across languages, while the latter 
is based on the relative frequency of the relevant changes across languages from 
a proto-stage (STAGE 0) into the synchronic stage (STAGE 1).¹ A typical characteristic
of strong universals is that the dynamic and the static evidence for these
universals converge. For example, the force that ALL LANGUAGES MUST HAVE VOWELS
(Comrie 1989: 19) finds solid evidence for universality on the static approach,
in the sense that one would hardly find a spoken language violating this uni- 
versal, i.e. a language without any vowels. The dynamic approach will equally 
show that, despite various language-specific processes such as vowel reduction 
strategies and even vowel loss, these never succeed to such an extent as to yield 
a language without any vowel, because no other universal or language-specific 
force may override this universal force in any type of language change. 

Another strong universal - albeit somewhat weaker than the former - con- 
cerns inflection: IF THERE IS ANY INFLECTION IN NOUNS, THERE IS ALSO SOME IN- 
FLECTION IN PRONOUNS (Moravcsik 1993; Plank et al. 2002ff.). A still weaker uni- 
versal - a number of exceptions can be found in the literature (cf. Handschuh 
2014) - concerns case marking: IN A LANGUAGE WITH CASE, THE ZERO-MARKED 
CASE TENDS TO BE THE ONE THAT MARKS THE SUBJECT OF INTRANSITIVE VERBS 
(Greenberg 1963: 95). 


¹Note that the static approach, too, assumes that the synchronic distributions are the result of
diachronic changes that have led to them (cf. Haspelmath 2019 [this volume]). It is, therefore,
only methodologically but not ideologically synchronic.




7 Weak universal forces: The discriminatory function of case in DOM systems 


Thus, there is gradience in the strengths of universals and, accordingly, in the 
number of exceptions found at STAGE 1 with each universal. By entertaining the
idea of gradience a bit further, one may also think of a force that systematically 
operates in the development of a particular category across languages, i.e. in the 
transition between STAGE 0 and STAGE 1, and is, therefore, a universal according 
to the definition in (1) above. However, this universal is not strong enough to 
override competing internal and/or universal forces to remain visible at STAGE 1. 
A universal of this kind is referred to as a weak universal force:


(2) Definition of a weak universal force 
A weak universal is a force that systematically exerts an impact in the historical
development from STAGE 0 into STAGE 1 in a particular (grammatical)
domain; this impact is found across geographic areas and genealogical affiliations
in the diachrony with significant frequency, but may be marginal
and heavily restricted or not be visible at all in the synchronic layer (STAGE 1).


The synchronic effects of a weak universal force often reside in marginal sub- 
domains or are overridden altogether by some other, stronger processes (cf. Bickel 
2014: 117). This, in turn, means that there will be a significant number of viola- 
tions and only a weak positive statistical signal (if at all). As a result, the standard 
methodologies that rely on the relative frequency in the prima-facie data will pro- 
vide disproof of universality. 

To give an example, Hammarström (2015) argues on the basis of 5,230 languages
that there is a universal trend for SVO word order across languages (cf.
Gell-Mann & Ruhlen 2011; Maurits & Griffiths 2014), henceforth the SVO UNIVERSAL.
Having said this, he claims that "the universal is not the only, nor the
most important factor" constraining the synchronic distribution; the most important
factor responsible for the current distribution is the order of the immediate
ancestor, i.e. inheritance. The following figures illustrate this point: SOV is much
more widespread than SVO across language families, with 65.1% SOV vs. 16.2%
SVO,² but a change from SOV to SVO and from VSO to SVO is significantly more
probable than the respective reverse changes (Croft 2003: 234; Maurits & Griffiths
2014). Hammarström (2015) shows that the pressure to retain the inherited
word order accounts for 78% of the sample, while the universal SVO accounts for
only 14% of the static evidence. The SVO UNIVERSAL is thus a weak universal in
the sense that it cannot so easily force a language to change into SVO against the
pressure of inheritance.


²SOV (43.3%) is attested only slightly more frequently than SVO (40.2%) if the genealogical bias
is not controlled for (cf. Dryer 2013). This effect is just due to a few large families with SVO
(Hammarström 2015).
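
The interplay between a weak transitional bias and strong inheritance can be illustrated with a toy two-state model. The sketch below is purely expository: the function name `simulate` and all probabilities are hypothetical values chosen for illustration, not estimates taken from Hammarström (2015) or any other source cited here.

```python
def simulate(share_sov, p_sov_to_svo, p_svo_to_sov, generations):
    """Return the share of SOV lineages after a number of generations.

    share_sov: initial proportion of SOV lineages (between 0 and 1).
    p_sov_to_svo, p_svo_to_sov: per-generation change probabilities
    (hypothetical); the remaining probability mass is plain inheritance.
    """
    for _ in range(generations):
        share_svo = 1.0 - share_sov
        share_sov = share_sov * (1.0 - p_sov_to_svo) + share_svo * p_svo_to_sov
    return share_sov


# Assumed rates: a change towards SVO is three times as likely as the
# reverse change, but at least 97% of lineages simply inherit their order.
start = 0.9
after_20 = simulate(start, p_sov_to_svo=0.03, p_svo_to_sov=0.01, generations=20)

# Long-run SOV share implied by the two rates alone.
equilibrium = 0.01 / (0.03 + 0.01)
```

Under these assumed rates, the dynamic bias clearly favours SVO (the long-run SOV share is only one quarter), yet after 20 generations SOV is still the majority type. A synchronic count alone would therefore fail to reveal the bias.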

In what follows, I argue that the discriminatory function of flagging is a weak 
universal despite apparent counterevidence. I illustrate this with qualitative data 
and arguments about how different motivations may lead to a result that is easily
misinterpreted if taken at face value. In order to do so, I first introduce the
(global) discriminatory function and the related phenomenon of local disambiguation
(§2). §3 exemplifies various differential object marking (DOM) systems and
how the discriminatory function interacts with other, stronger forces in each of
them. Finally, §4 provides a discussion of the phenomenon of weak universals
and conclusions.


2 The (global) discriminatory function 


Since a transitive clause has two arguments (A and P), it must be ensured that the 
hearer will be able to discern which of the arguments should be interpreted as 
A and P, respectively. Moreover, other potential misinterpretations, such as one
NP modifying the other NP (if both are adjacent to each other) or both NPs
being coordinated (without a conjunction), should be excluded. There are many
ways in which the discriminatory function may be implemented in a particular 
language or even in a particular sentence, with flagging being one of them: 


(3) Definition of the global discriminatory function of P flagging (economy 
subsumed) 
In a transitive clause, the A and the P argument must be sufficiently dis- 
ambiguated, e.g. by word order, agreement, voice, world knowledge, and 
it is only if they are not that there is dedicated P flagging. 


A number of researchers have argued that there is little or no evidence
for (A or P) flagging systems being driven by the discriminatory function as defined
in (3) cross-linguistically (inter alia, Aissen 2003; Malchukov 2008; various
papers in de Hoop & de Swart 2009). Levshina (2018) shows on the basis of the
large-scale AUTOTYP database that there is no statistically significant effect of
the discriminatory function observable for flagging because there are only very
few languages in which flagging is primarily driven by the discriminatory function.
Sometimes even in these languages, the discriminatory function does not
serve the purpose of discrimination between A and P alone: a function inherited
from the source construction and often some ongoing conventionalization of the
most frequent discrimination patterns override the discriminatory function to
various extents.

Having said this, it has been repeatedly suggested that flagging might also 
serve the discriminatory function, especially if A and P have similarly ranked 
input (cf., inter alia, Comrie 1978; 1989; Dixon 1994; Silverstein 1976; Kibrik 1997). 
Bossong (1985: 117) even assumed that the emergence of DOM is primarily due to 
the discriminatory function. In the following section, I follow this line of think- 
ing and provide qualitative evidence for the claim that the discriminatory func- 
tion does operate across genealogically and areally diverse DOM systems and 
is therefore a universal according to the definition given in (1). However, it is 
not a typical universal in that its impact is mostly weakened by other competing 
processes to which it is subordinate, the effect being that there is only marginal 
evidence for it at the synchronic STAGE 1. 


3 Evidence from DOM systems 


Consider the DOM system of the rural variety of Donno So, as described in Culy 
(1995). The DOM suffix -ñ marks human and often animal-denoting pronouns 
and nouns if the latter are definite: 


(4) Donno So (Dogon: Mali; Culy 1995: 48)
Anta-ñ ibera yaw aa bem.
Anta-DOM market.LOC yesterday see.PTCP AUX.1SG


‘I saw Anta at the market yesterday.’


(5) Donno So (Dogon: Mali; Culy 1995: 48) 
Jalombe izombe-ñ keraa biyaa.
donkey.DEF.PL dog.DEF.PL-DOM bite.PTCP AUX.3PL


‘The donkeys bit the dogs.’


In contrast, neither indefinite animates nor inanimate definites are marked. We 
observe that at least two referential scales are simultaneously operating here:³


³In this paper, I do not make any assumptions about the nature of referential scale effects:
whether they stem from generalizing the most frequent patterns conditioned by the discriminatory
function (cf. Aissen 2003) or from the source (i.e. from topics, cf. Dalrymple & Nikolaeva
2011), or are language-specific (Bickel et al. 2015), or whether they represent an independent
phenomenon sui generis, is irrelevant here.


(6) Animacy scale
human > animate > inanimate 


(7) Definiteness scale 
definite > specific > indefinite


Superficially, the discriminatory function does not seem to apply in this lan- 
guage since scale effects from (6) and (7) predominate: all animate and definite 
NPs are marked regardless of whether they really need to be globally disam- 
biguated or not. However, the animate indefinite NPs that are not (yet) affected 
by the scale effects do show the operation of the discriminatory function. With 
these NP types, the DOM marker may be employed to discriminate between A 
and P in a particular utterance (cf. the Disambiguation Principle in Culy 1995: 
52). For example, when both the object and the subject NP are indefinite and 
animate and there are no other clues how to discriminate between A and P, the 
DOM marker may be employed "against" the force of marking definite animates 
only: 


(8) Donno So (Dogon: Mali; Culy 1995: 53) 
Wezewezegine yaana po-ñ don wo mo ni tembe. 
crazy.person woman large-DOM place 3SG PS at found


‘A crazy person found a large woman at his/her place.’


In this example, both indefinite NPs ‘a crazy person’ and ‘a large woman’ may 
potentially be interpreted as A (Culy 1995: 53). Therefore, the DOM marker -ñ 
is used here to unequivocally mark the syntactic role of ‘a large woman’. The 
discriminatory function is the weakest of the forces at work here (Culy 1995: 53)
because it applies only exceptionally, constraining a single slot on the
referential scales in (6) and (7): the indefinite animate P. In accordance with Culy
(1995: 51), one can thus posit the following forces and their relative weight (from 
the strongest to the weakest): 


(9) The relative weight of the main forces on DOM in Donno So (and 
Malayalam, see below) 
animacy scale + definiteness scale > discriminatory function 
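
The ranking in (9) can be read as an ordered decision procedure in which the stronger forces apply first. The following sketch assumes a deliberately simplified encoding of the Donno So facts as summarized above: definite animate Ps are rigidly marked, inanimate Ps are rigidly unmarked, and indefinite animate Ps are marked only under A/P ambiguity. The function and parameter names are hypothetical and serve illustration only.

```python
def donno_so_marks_p(animate: bool, definite: bool, ambiguous: bool) -> bool:
    """Toy model of the ranking in (9) for the P argument in Donno So.

    animate: the P referent is human or animal.
    definite: the P referent is definite.
    ambiguous: nothing else in the utterance tells A and P apart.
    """
    # Stronger forces: the referential scales decide first.
    if animate and definite:
        return True   # rigid marking, whether or not it is needed
    if not animate:
        return False  # rigid zero, even if A and P remain ambiguous
    # Weaker force: only the indefinite animate slot is left open
    # for the discriminatory function.
    return ambiguous
```

In this encoding the discriminatory function surfaces in exactly one cell of the paradigm, which is why its effect is hard to detect when only overall marking patterns are counted.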


Another important observation can be made here. Notice that the slot on the 
referential scales in (6) and (7) that is open for the application of the discriminatory
function is immediately next to the slots that require rigid marking. I interpret
this in the following way. In their historical developments, many DOM systems
extend the DOM markers gradually from left to right on referential scales
such as (6) and (7) (cf. Dalrymple & Nikolaeva 2011). For example, many languages
start with a DOM system that applies only to animate nouns but then gradually 
extend the DOM marker onto inanimate nouns as well. Note that very often the 
difference in meanings between the two neighbouring slots on a referential scale 
is quite substantial and is certainly not graspable in terms of semantic extension. 
For example, the expansion of the DOM marker -ra(y) from mostly animates in 
Middle Persian (Key 2008: 244; cf. also Paul 2008: 152-153) to the inclusion of 
inanimates in Modern Persian is not semantically straightforward, since the two 
are rather antonymic in meaning. I suggest that it is precisely the discriminatory 
function that is responsible for the expansion of the DOM marker into the next 
slot on the scale because the discriminatory function is not dependent on the lex- 
ical meaning of the noun in the same way as, for example, the animacy scale. The 
discriminatory function then applies to the next slot until that slot also becomes 
conventionalized, and so on. 

A constellation very similar to Donno So is found in Malayalam (Dravidian). 
The Accusative marker -(y)e is regularly used with animate specific object refer- 
ents but is normally ungrammatical with inanimate referents: 


(10) Malayalam (Dravidian: India; Asher & Kumari 1997: 204) 
Tiiyyo kutil nafippacu.
fire.NOM hut.NOM destroy.PST
‘Fire destroyed the hut.’


However, in one special case, it may be used on inanimate referents as well, i.e. 
precisely when there is no other way to (globally) discriminate P from A (Asher 
& Kumari 1997: 204, cf. Stiebels 2002: 16; Subbārāo 2012: 174-176):


(11) Malayalam (Dravidian: India; Asher & Kumari 1997: 204) 
a. Kappal tiramaalakal-e bheediccu.
ship.NOM wave.PL-ACC(-DOM) split.PST
‘The ship broke through the waves.’


b. Tiramaalakal kappal-ine bheediccu.
wave.NOM.PL ship-ACC(-DOM) split.PST


‘The waves split the ship.’


As in Donno So above, the discriminatory function becomes visible only in 
those slots on the referential scales that are not (yet) affected by the scale effects. 


155 


Ilja A. SerZant 


While in Donno So the indefinite animate slot became available for the discrim- 
inatory function, it is the inanimate slot (both definite and indefinite) in Malay- 
alam. The relative weight of the discriminatory function of the DOM marker in 
Malayalam is lower than the effect of the referential scales, cf. (9) again. 
Crucially, if one were to superficially evaluate whether or not the discrimina- 
tory function operates in Donno So or Malayalam, one would have to conclude 
that it does not, because of the rigid marking of animates (definite animates in 
Donno So) and the rigid zero with inanimates. Thus, from the perspective of 
the discriminatory function, utterances like (4) redundantly mark their objects; 
conversely, examples such as (10) are economical but equally violate the discrim- 
inatory function since same-rank A and P are not disambiguated. I summarize: 


(12) The relative weight of the main forces in DOM in Donno So and 
Malayalam 
animacy scale + definiteness scale > economy > discriminatory function 


Catalan is another example of this pattern. Here, the DOM marker a is obliga- 
tory only for strong (non-clitic) personal, relative and reciprocal pronouns in the 
non-colloquial register (cf. Escandell-Vidal 2009). Thus, the DOM marker of Cata- 
lan is primarily conditioned by the parts-of-speech scale: Pronouns are marked 
while other NPs are unmarked: 


(13) Parts-of-speech scale 
(independent) pronouns > nouns


However, the DOM marker may exceptionally appear also with definite animate 
NPs in the contexts of subject-object ambiguity (Wheeler et al. 1999: 243): 


(14) Catalan (Romance: Spain; Wheeler et al. 1999: 243) 
T'estima com a la seva mare.
2SG.OBJ-love.PRS.3SG like DOM DEF.F 3SG.F.POSS mother


‘She loves you like (she loves) her mother.’


Again, the discriminatory function is subordinate to the parts-of-speech scale 
(13). It may only exceptionally violate the cut-off point between pronouns and 
nouns on this scale that is otherwise rigid in this language. Additionally, the ani- 
macy scale (6) and definiteness scale (7) apply in that they determine the NP type 
for which the discriminatory function may operate: the discriminatory function 
can only operate on definite animates but not on inanimates or indefinites in this 
language. I summarize:

(15) The relative weight of the main forces in DOM in Catalan
parts-of-speech scale > animacy + definiteness scale > discriminatory 
function 


The situation in Spanish is somewhat different but largely analogous. Animate
and specific NPs must be marked while inanimate and/or non-specific NPs must 
remain unmarked. However, the DOM marker a is obligatory in certain contexts 
of disambiguation, even with inanimate NPs: 


(16) Spanish (Romance: Spain; von Heusinger & Kaiser 2007: 89) 
En esta receta, la leche puede sustituir a=l huevo.
in DEM recipe DET milk can replace DOM=DET egg


‘In this recipe, the milk can replace the egg.’


We observe the same constellation here: the discriminatory function is subordi- 
nate to the effects of referential scales. 

Another example is the DOM marker -án in Hup (Nadahup). It is obligatory 
with definite animates (including pronouns) as well as with the plural collective 
marker =d’ah (Epps 2008: 170-177). At the same time, the DOM marker -án may 
be used with indefinite animates to discriminate the P argument from A (Epps 
2009: 95). Consider the following example, in which the A argument is left out 
because it is non-referential: 


(17) Hup (Nadahup: Brazil/Colombia; Epps 2009: 95)
Hüp-án Io way, hüp-án dóh-óy.
person-DOM scold-DYN person-DOM curse-DYN


‘(Some people) scold people, cast curses on people.’


The P argument is not referential either, let alone definite. Since it is indefi- 
nite, it should not be marked. However, in order to discriminate the P argument 
from a possible misinterpretation as A, the object marker is used here (Epps 2009: 
95). Again, the discriminatory function is weak because it is subordinate to the 
referential-scale effects which primarily determine the slots in which the discrim- 
inatory function may apply (e.g. on inanimates or indefinites or non-referential 
NPs, etc.). The relative weight of these is the same as in Catalan in (15) above. 

The subordinate discriminatory function is found in other Nadahup languages
as well. For example, the object marker -i:y? in Dâw accompanies topical objects
but it may also be used for the discriminatory function (Martins & Martins 1999:
263-264).

Similarly, the Papuan language Awtuw obligatorily marks all pronominal and
proper-name direct objects regardless of whether there is a need for discrimina- 
tion or not: 


(18) Awtuw (Sepik: Papua New Guinea; Feldman 1986: 109)
*Wan rey du-k-puy-ey.
1SG 3M.SG FA-IPFV-hit-IPFV


[Intended meanings] ‘I'm hitting him.’ / ‘He's hitting me.’


In addition, overt definiteness - marked either by a demonstrative or a pos- 
sessor NP - has the tendency to attract object marking regardless of the context 
(Feldman 1986: 109-110). By contrast, the marking of common nouns is optional.
It becomes obligatory in case of ambiguity, or else the NPs will be interpreted as 
conjoined (Feldman 1986: 110): 


(19) Awtuw (Sepik: Papua New Guinea; Feldman 1986: 109) 

a. Piyren-re yaw di-k-zel-iy.
dog-DOM pig FA-IPFV-bite-IPFV
‘The pig is biting the dog.’

b. Piyren yaw di-k-zl-iy.
dog pig FA-IPFV-bite-IPFV
‘The dog and the pig bite.’ / ‘The pig is biting the dog.’ / ‘The dog is
biting the pig.’


The situation in Awtuw is slightly different from the one found in the lan- 
guages above: the slot affected by the discriminatory function (common nouns) 
already allows for the overt marking; the discriminatory function turns the mark- 
ing in a particular utterance from optional into obligatory for this particular in- 
terpretation. 

The prepositional DOM marker bǎ of Chinese primarily occurs before animate,
definite or, rarely, indefinite specific preverbal object NPs while postverbal ob- 
jects are never marked with it (Li & Thompson 1981; Bisang 1992: 158-159; Yang 
& van Bergen 2007): 


(20) Chinese (Sinitic: China; Li & Thompson 1981: 464) 
Tā bǎ fàntīng shōushi gānjìng le.
3SG DOM dining.room tidy.up clean PFV


‘S/He tidied up the dining room.’

The discriminatory function as defined in (3) above is not relevant in (20). In
addition to the general SVO and S bǎ OV word orders, Chinese also allows for
OSV with topical objects and prominent subjects, cf. (21):


(21) Chinese (Sinitic: China; Bisang 1992: 158) 
a. Láng Mary chī-le.
wolf Mary eat-PFV
‘Mary ate the wolf.’
b. Láng bǎ Mary chī-le.
wolf DOM Mary eat-PFV
‘The wolf ate Mary.’


To force the interpretation of (21) with SOV, the bǎ marker has to be used in order
to disambiguate the referentially more prominent NP (Mary) as P (cf. Bisang
1992: 158). Again as in the examples above, the DOM system of Chinese is primarily
driven by the cut-off points on referential scales (definiteness, animacy) and
some other strong rules pertaining to affectedness, aspectuality and the “disposability”
of the object referent (cf. Li & Thompson 1981). Some of these functions
are most probably inherited from the source, such as the requirement on disposability
or the preverbal position, which may be explained as the retention of
the properties of the source construction.⁴ The discriminatory function is thus
again limited to a particular constellation of (21) in which the source function,
referential scale effects and other forces allow it to operate.

The discriminatory function in Mam (Mayan) is carried out by the obligatory 
cross-referencing of both A and P on the verb; no flagging is involved. By con- 
trast, the Antipassive form of the verb does not allow for cross-referencing the 
P argument, which is regularly marked by the preposition / relational noun -i7j 
‘about’ or -ee (dative, beneficiary) (England 1983: 212): 


(22) Mam (Mayan: Guatemala; England 1983: 213)
ma ø-tzyuu-n Cheep *(t-i?j) xiinaq
REC 3A-grab-ANTIP Jose *(3SG-RN) man
‘Jose grabbed the man.’

However, “if there is no confusion as to which noun phrase is the agent and
which is the patient”, the relational noun may be omitted in order to code the
meaning of an unintentional act (England 1983: 212):


(23) Mam (Mayan: Guatemala; England 1983: 212-213)
a. Ma ø-tzyuu-n Cheep t-i?j ch'it.
REC 3A-grab-ANTIP Jose 3SG-RN bird
‘Jose grabbed the bird.’
b. Ma ø-tzyuu-n Cheep ch'it.
REC 3A-grab-ANTIP Jose bird
‘Jose unintentionally grabbed the bird.’


⁴The bǎ marker stems from the lexical verb ‘to hold’ in a serial verb construction (Sun 1996:
61-62).


The discriminatory function thus delimits the range of the input with which unin- 
tentional acts can be expressed (in the Antipassive). In other words, the discrim- 
inatory function of flagging is found in a very small subdomain of the language, 
i.e. in the unintentional use of the Antipassive. 

A somewhat different constellation is found in Tamasheq (Berber). The marker 
na (na, na depending on the dialect and tone sandhi) occurs only in SOV word 
order - never in SVO or VSO - and only if there is no verb inflection (Perfective 
Indicative), i.e. when no disambiguation via indexing is possible (Heath 2007:
92, 94). Moreover, both arguments must be expressed overtly. For example, the 
marker cannot be used in the imperative with the subject dropped (Heath 2007: 
92-93). These requirements suggest that the marker is conditioned by the dis- 
criminatory function: 


(24) Tamasheq (Afro-Asiatic, Berber: North Africa; Heath 2007: 91; glosses 
adapted) 
Hàr-óó na háns-óó  kárü. 
man-DET.SG DOM dog-DET.SG hit
‘The man hit the dog.’


Without na, both NPs may be misinterpreted as either a compound or as a possessor
phrase ‘the man's dog’ (Heath 2007: 91).

Moreover, some Mande languages such as Soninke, Bambara, Wan or Songhay 
languages of the area also have similar markers that primarily fulfil the discrim- 
inatory function of unambiguous identification of the subject and the object in 
a clause (Heath 2007; Creissels & Diagne 2013; Nikitina 2018). While Tamasheq,
similarly to many Central Mande languages, has generalized the marker, extending
it onto all SOV utterances, Wan (South-eastern Mande) employs the marker
laa predominantly in those input configurations which are in need of disambiguation
given SOV. The marker is used with nominal A and pronominal
P (62%) but not with pronominal A and nominal P (0%) (Nikitina 2018: 202). In
contrast to the languages discussed above, in these languages the discriminatory
function is somewhat stronger, as it applies across the board under SOV. Analogously,
the DOM marker is optional in the most frequent SOV word order in
Korean but becomes almost obligatory when the object is preposed (OSV) (Ahn
& Cho 2007).


⁵It is referred to as a “bidirectional case marker” in Heath (2007) as well as in the descriptions of
some Mande languages, cf. Diagana (1995), Nikitina (2018). Bidirectional case markers cannot
be straightforwardly related to either A or P marking since they occur only when both are
present and do not show any phonetic or syntactic fusion effects. Note that bidirectional case
markers are treated under the heading of differential argument marking, cf. Nikitina (2018).

At least two Loloish languages (Tibeto-Burman) also attest a strong discrim- 
inatory function that is not subordinate to some other force. The direct-object 
markers tʰaʔ in Lahu and tʰie in Lolo are only used if the context does not help
to discriminate between A and P. That is, these markers code direct objects only 
where the inherent semantics of the participants (such as animacy) and the se- 
mantics of the event fail to do so: 


(25) Yongren Lolo (Tibeto-Burman, Loloish: China; adapted from Gerner 2008:
299)⁶
no gemo tie Is Zi.
1SG snake DOM follow go


‘I will follow the snake.’


(26) Yongren Lolo (Tibeto-Burman, Loloish: China; adapted from Gerner 2008: 
300) 
Sika tie xek'u ti na. 
tree DOM house smash broken


‘The house smashed the tree.’


The absence of the Accusative marker would not be ungrammatical but would 
create ambiguity as to who is following whom in (25) or what is smashing what 
in (26) (Matisoff 1973: 156; Gerner 2008). However, along with the synchronically 
primary function of discriminating P from A (and also R from A), this marker 
also has the diachronically primary function of coding contrastive focus (Gerner 
2008: 298-289). For example, (27a) cannot be used with the DOM marker tʰie
because of the lack of a focal contrast. By contrast, (27b) is acceptable with it if 
the numeral is interpreted as bearing contrastive focus (Gerner 2008: 299): 


⁶I simplified the transliteration and slightly adjusted the glossing of all examples from Gerner
(2008).

(27) Yongren Lolo (Tibeto-Burman, Loloish: China; adapted from Gerner 2008:
299)


a. Bolu molu tsi o. 
Bolu trousers wash PRF 


‘Bolu washed trousers.’


b. Bolu məlu so kho thie tsi 9. 
Bolu trousers NUM.3 CLF DOM wash PRF 


‘Bolu washed THREE pairs of trousers [not just TWO].’


Importantly, (27b) may at first glance be interpreted as counterevidence to the 
discriminatory function because A and P are sufficiently disambiguated by the 
lexical meanings anyway. Hence, the marking is not due to the discriminatory 
function. I claim that this is not a piece of counterevidence for the hypothesis
of a weak discriminatory function. It may only count as counterevidence for the
strong hypothesis of the discriminatory function being the only force constrain- 
ing DOM (which is counter-intuitive anyway). The source function of marking 
contrastive focus overrides the discriminatory function here. A situation where 
various new and inherited functions cluster on one marker is typical of many 
grammatical categories (cf. Hopper 1991: 22). For example, if an indefinite article 
does not mark plural indefinite NPs but only singular ones, this cannot be taken 
as counterevidence for its being an indefinite article. A more plausible account 
is that the restriction to the singular is just the impact of the source meaning. 

Another similar DOM system is the one of Khwe. In this language, proper 
names must obligatorily be marked with à/-à; additionally, this marker encodes
contrast and/or focus on the NP (Kilian-Hatz 2006: 82-83). At the same time, the 
marker may also be used in contexts in which the distinction between subject 
and object would have been impeded, for example, when both arguments are 
animate and topical (Kilian-Hatz 2006: 82-83): 


(28) Khwe (Khoe: Southern Africa; adapted from Kilian-Hatz 2006: 83) 
a. Tcá tí à kx óaàa. 
2SG.M 1SG DOM wait
‘You have to wait for me!’
b. Yàá! Cáó à ti kyá-rá-hà!
yes 2DU.F DOM 1SG love-ACT-PST


‘That's it! I love you two (women)!’

Further examples may be added. For example, the DOM marker -m in Imonda
(Papuan) is used obligatorily with some verbs such as eg ‘to follow’ or hetha
‘to hit’ as well as with others to denote something like resultativity (“directionality”)
of the action (Seiler 1985: 163). However, in addition to that, the marker
may also serve to disambiguate Ps from As when both have similar-rank input 
(Seiler 1985: 165). Furthermore, DOM in Guaraní is primarily conditioned by ani- 
macy, definiteness and topicality but it may also marginally fulfil the discrimina- 
tory function (Shain 2009: 89-92). In Telkepe (Semitic, Aramaic), the new object 
marker ta may be employed in those situations where agreement alone does not 
provide for disambiguation while it is otherwise heavily constrained by its mean- 
ing of marking topics (Coghill 2014: 354). Finally, Kurumada & Jaeger (2015) show 
for Japanese that, in addition to animacy, disambiguation also triggers the DOM 
marker -o (see also Fedzechkina et al. 2012). 

The discriminatory function may help to explain the world-wide distribution 
of DOM, namely, why there are more animacy-driven DOM systems than those 
driven by definiteness and/or specificity across languages. Thus, in a large-scale 
typological study by Sinnemäki (2014: 295), roughly 39% of DOM systems are con- 
ditioned by animacy, while DOM systems conditioned by definiteness/specificity 
are areally biased towards the Old World and occur less frequently (34% of his 
sample). I claim that the reason for this is that animate referents are much more 
strongly associated with the A role than definite/specific referents. Hence, there 
is a more urgent need with animate than with definite referents for the discrim- 
inatory function to apply. A number of corpus studies from various languages 
show that only animacy shows reversed association tendencies with A and P 
such that As tend to be animate while Ps tend to be inanimate; by contrast, both 
As and Ps – with minor differences – tend to be definite and/or specific (Dahl 
2000; Hofling 2003; Everett 2009; Fauconnier & Verstraete 2014). 

Finally, there is neurolinguistic evidence for the discriminatory function,
suggesting that A and P are not treated symmetrically by the processor.
Bornkessel-Schlesewsky & Schlesewsky (2015) claim that the effects they observe
cannot be explained simply by the degree of semantic association of each
argument with the A or P role. Rather, both arguments are interpreted relative to
each other (Bornkessel-Schlesewsky & Schlesewsky 2015: 336). Analogously,
Kurumada & Jaeger (2015) found in their psycholinguistic study on DOM in Japanese
that the properties of the arguments alone are insufficient to explain the results
of their experiments and that case-marking is affected by the plausibility of role
assignment given both arguments and the verb (2015: 161; cf. also Ahn & Cho 2007;
Fedzechkina et al. 2012).

Above, I have argued that the global discriminatory function as defined in (3) 
is found to operate in many diverse languages. Moreover, I have found that it 
is most frequently the weakest force alongside other forces, such as referential 
scale effects (based on animacy, definiteness or parts of speech) or the source 
meaning (focus, topic, etc.). All these forces constrain the DOM systems at the 
same time. The weakness of the discriminatory function is not correlated, I claim, 
with scarce attestation across languages. On the contrary, I suspect that its impact 
could be found across most of the DOM systems if one took a closer look at the 
historical developments and if the synchronic descriptions were more detailed. 

The context-dependent, global discriminatory function in (3) is relatively ex- 
pensive because it requires whole-utterance planning and online decision mak- 
ing on the part of the speaker. It is costly for the hearer as well since ambiguous 
NPs (e.g. German die Frau ‘DET.NOM=ACC woman’), if placed clause-initially,
can only be interpreted by the hearer once enough context has been provided,
and not incrementally (Bornkessel-Schlesewsky & Schlesewsky 2014: 107). It is
perhaps for this reason that the global discriminatory function often develops
into what may be called a local discriminatory function (cf. Aissen 2003; Zeevat
& Jäger 2002; Jäger 2004; Malchukov 2008: 208, 213). By virtue of the local
discriminatory function, the NP is disambiguated as A or P immediately and regardless 
of whether the whole utterance might make disambiguation redundant. The lo- 
cal discriminatory function is more efficient because it allows for more reliable 
incremental processing of the utterance. The degree of efficiency and processabil- 
ity, in turn, correlates with the strength of a force (Hawkins 2014: 60, 69). This 
is why the global discriminatory function (cf. (3)) is a weak force and its effects 
tend to be generalized over diachronically, for example, by conventionalizing the 
flagging on those NP types that tend to be disambiguated most frequently or, al- 
ternatively, by conventionalizing the marker in those constructions that require 
disambiguation most frequently (such as SOV in Tamasheq). 

A number of languages have undergone this change towards local disambigua- 
tion. I illustrate this with the development of DOM in Russian. I base my argu- 
mentation on the philologically profound evidence from Krys'ko (1994; 1997). 

Old Russian inherited from Proto-Slavic the emergent DOM system that evol- 
ved in the following way. The direct object was marked by the Accusative case 
in affirmative clauses and by the Genitive case in clauses with predicate nega- 
tion. Already during the Proto-Slavic period, the Genitive started penetrating 
into affirmative transitive clauses (Klenin 1983). The reason is that, under
predicate negation, the Genitive no longer carried any functional load but became just
a purely syntactically conditioned rule. The Genitive was thus just another way
of marking direct objects (when the predicate was negated) alongside the Accusative.
At the same time, due to the overall loss of all word-final consonants, the old
Accusative and Nominative markers became phonetically indistinguishable in the
singular in most of the Proto-Slavic declensions and, subsequently, turned into
zero: 


Table 1: Phonetically driven conflation of the old Accusative with the 
old Nominative in most of the declensions (cf. Arumaa 1985: 130) 


                Proto-Slavic      Proto-Slavic      Resulting form 
                Nominative        Accusative        Accusative = Nominative 
u-declension    *-us              *-um              > *-u > -ъ > Ø 
i-declension    *-is              *-im              > *-i > -ь > Ø 
o-declension    *-os > *-us       *-om > *-um       > *-u > -ъ > Ø 
jo-declension   *-jos > *-jus     *-jom > *-jum     > *-ju > *-jo > -jь > -jø 


The new DOM marker – i.e. the Genitive case – replaced the old (zero) Ac- 
cusative only on animate nouns and some pronouns. Importantly, only those 
animate nouns and pronouns were affected which belonged to the declension 
classes that did not differentiate between the Nominative and the Accusative 
anymore (cf. Table 1). Thus, the expansion of the new DOM marker (Genitive) 
was crucially conditioned by the local discriminatory function alongside the an- 
imacy scale (Krys'ko 1994). 

The evidence for this is abundant: (i) The Genitive did not replace the old Ac- 
cusative in the a-declension because, in this declension, the old Accusative (-9 
> -u) had not become indistinguishable from the Nominative (a) due to nasal- 
ization of the former. (ii) The first NP types affected were proper names while 
personal pronouns generally remained unaffected to begin with, which is atypi- 
cal of DOM systems that tend to expand along the referential scales. The reason 
for this is that personal pronouns had not undergone the phonetic conflation 
of the Nominative (cf. azv '1sc.NoM") and the Accusative (cf. me ‘Isc.acc’) and 


"Originally it had an emphatic function similarly to double negation in, for example, French, cf. 
Kurylowicz (1971). 

"There are no unambiguous Genitive forms of pronouns in the position of a direct object in 
Early Slavic (Meillet 1897: 84, 97; Vondrák 1898: 327; Krys'ko 1994: 128). Following Meillet, 
Kurylowicz (1962: 251) concludes that chronologically, the Accusative-from-Genitive with per- 
sonal pronouns must be later than with animate masculine nouns. 
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hence were not in need of disambiguation. (iii) The plural of the o-declension - 
in contrast to the singular - did retain the phonetic distinction between the old 
Accusative (-y) and the old Nominative (-i) and thus the old Accusative was not 
replaced by the new DOM marker here. Only later, between the 14^ and 16% 
c., were both the old Accusative plural and the old Nominative plural conflated 
into -y. Precisely from this period onwards, the new DOM marker (Genitive plu- 
ral) started to be used instead of the Accusative in the plural (Krys'ko 1994: 144). 
(iv) The third person pronoun j- did not have a Nominative form in Early Slavic 
(various demonstratives were used instead here). Hence, there was no need for 
disambiguation; Although the form ji itself would have been morphologically 
ambiguous between the Nominative and Accusative, it was reserved for the Ac- 
cusative only. This pronoun acquired the new DOM marker much later than the 
relative pronoun ji-Ze (both are etymologically related). Since the relative pro- 
noun ji-Ze did have both the Nominative and the homophonous Accusative forms, 
it acquired DOM very early. (v) Finally, as Krys'ko (1993) shows, the conflation 
of the old Nominative with the old Accusative took place much later in the Old 
Novgorodian dialect, because the latter retained the dedicated Nominative form 
-e in the o-declension, as opposed to the old Accusative (-» > ø). The erstwhile 
retention of the dedicated Nominative affix guaranteed the distinction between 
A and P and hence no DOM was needed until the Nominative affix disappeared 
in this dialect, too. 

In all instances in which either the Accusative or the Nominative was not zero 
or the Nominative did not exist at all, the new DOM marker was introduced much 
later or not at all. It was precisely the Nominative-Accusative syncretism, i.e. the 
indistinguishability of A and P, that triggered the introduction of the new DOM 
marker. This relative chronology of the expansion of the Genitive to different 
NP types suggests that the discriminatory function was the crucial trigger con- 
ditioning it (first in Dobrovsky 1834: 39; Krys'ko 1994: 156; Tomson 1908; 1909). 
Although there is no direct evidence for the global discriminatory function as in 
(3), the consistent application of local disambiguation in different nominal and 
pronominal classes might suggest that there was a development from global to 
local disambiguation by means of conventionalization. 

The domain of the discriminatory function was determined by a language- 
specific phonological process, namely, the loss of word-final consonants: Only 
those declensions were affected which had undergone the phonetic conflation 
of the old Nominative and Accusative. I conclude that the following forces were 
crucial in the development of Russian DOM (alongside some others such as
analogical levelling): 

(29) The relative weight of the main forces in the development of DOM in 
Russian 
complete loss of word-final consonants > discriminatory function > 
animacy scale 

It is clear that the complete loss of word-final consonants was a stronger force in 
Proto-Slavic than the discriminatory function because otherwise the latter would 
have blocked the former. Crucially, the resulting synchronic picture – if looked 
at superficially – clearly violates the animacy scale and the global discriminatory 
function as in (3). While some declensions distinguish between animate and inan- 
imate nouns by means of the new DOM marker, other declensions do not have 
this distinction and mark animate and inanimate Ps indistinguishably. 


4 Discussion and conclusions 


In this paper, I have taken a dynamic perspective on the development of DOM 
systems. I have provided qualitative evidence from a number of areally and ge- 
nealogically unrelated languages for the claim that the discriminatory function 
of case keeps appearing in the diachrony of DOM systems in various subdomains 
and/or leaves behind traces in the form of local disambiguation. Importantly, the 
discriminatory function is not dependent on the respective historical source of 
the DOM marker and its particular developmental path. It is only the range of its 
application in a particular DOM system that is indeed very much constrained by 
the source meaning of the marker and/or by scale effects. Even scale effects them- 
selves are sometimes just a strong residual of the source meaning of the DOM 
marker. For example, DOM markers of many languages (Persian, Romance, Ka- 
nuri, etc.) stem from topic markers (cf. Iemmolo 2010; Dalrymple & Nikolaeva 
2011; see also Cristofaro 2019 [this volume]). In other instances, the scales are 
epiphenomenal, as they represent conventionalizations of the most frequent pat- 
terns originally conditioned by the discriminatory function (e.g. in Russian). 

Thus, the discriminatory function is frequently subordinate to other, stronger 
pressures, foremost the source meaning of the relevant marker. In addition,
pressures like paradigmatic levelling (cf. Jäger 2007: 102) or analogical extension
play a role in individual systems. Even those DOM systems which are primarily
conditioned by the discriminatory function synchronically (such as the one of Yongren
Lolo) never have the discriminatory function as the only constraint. I conclude
that, even though it recurs from language to language in the transition, the
discriminatory function is not strong enough to resist competition with other
forces.

But what conditions the power of the discriminatory function in a particular
DOM system? The degree to which the discriminatory function is found to 
operate synchronically in a particular DOM system or subsystem sometimes cor- 
relates positively with how recent the DOM (sub)system is in the language. Thus, 
the evidence for the discriminatory function is most clearly found in those DOM 
(sub)systems that emerged relatively recently. For example, the use of the marker 
laa in Wan (South-eastern Mande) to discriminate between A and P is a very re- 
cent phenomenon, while its original function was one of marking the focus and 
the focused agent in a perfect-passive-like construction (Nikitina 2018). In Wan, 
it is the whole system of differential marking that is recent (Nikitina 2018). In 
Spanish, only the subsystem of definite inanimate NPs, as in (16), has recently 
been affected by DOM (inanimates are not affected by DOM in Old Spanish). It 
is this slot where the discriminatory function is found to operate occasionally. 
But differential object marking as such is quite an old phenomenon in this lan- 
guage. 

By contrast, in older DOM systems, the effects of the discriminatory function 
tend to conventionalize to replace context-dependent rules that are much costlier 
in processing. The DOM marker is generalized in those contexts that were most 
frequently in need of global disambiguation. The generalization may proceed (i) 
along particular NP types or (ii) along particular constructions / word orders. 
For example, (i) Catalan generalized the DOM marker with personal pronouns 
regardless of whether there was a need for disambiguation or not in a particu- 
lar utterance. By contrast, (ii) many Mande languages, Songhay and Tamasheq 
(Berber) generalized the marker in the APV (SOV) word order with no auxil- 
iaries intervening between A and P in constructions requiring both overt A and 
P. These were precisely those contexts in which the distinction between A and 
P was particularly blurred. By contrast, the imperative does fulfil the discrimina- 
tory function, albeit in a different way: The sole NP that is expressed overtly is 
the P argument, while A is dropped. Hence, there was no need for a distinction 
between A and P by means of flagging here. 

There are other types of bivalent constructions, such as equative construc- 
tions or comparative constructions, which are also sometimes constrained by 
the discriminatory function in order for the hearer to coherently process them. 
Unfortunately, they have never been considered in the general discussion on the 
discriminatory function of flagging, probably because the conventionalization 
processes involved here do not proceed along the same scales as the prototypi- 
cal transitive constructions. However, this effect is certainly just due to different 
semantic expectations, e.g., as to the standard of comparison and the comparee
in the comparative construction, than in a prototypical transitive construction.
Furthermore, the discriminatory function of flagging is found to apply in
ditransitive constructions of some languages in order to distinguish between A and
R, which have similar semantic entailments and thus often do not provide
sufficient cues for the correct interpretation themselves. 

In more general terms, I have argued for the existence of weak universals:
a type of universal force that applies across different languages and language
families but which is not strong enough to prevail into the synchronic STAGE 1.
I claim that the (global) discriminatory function of flagging is a weak universal.
This claim is supported by neurolinguistic and psycholinguistic evidence which
suggests that both arguments are interpreted relative to each other and cannot
be reduced to the degree of semantic association of each argument with the role 
it bears (Bornkessel-Schlesewsky & Schlesewsky 2015: 336; Ahn & Cho 2007; 
Fedzechkina et al. 2012; Kurumada & Jaeger 2015). 

Its weakness is possibly motivated by a higher processing load (cf. Hawkins 
2014: 60, 69) as compared to local disambiguation: it requires pre-planning of the 
whole clause by the speaker and hinders incremental processing by the hearer. 
By contrast, local disambiguation is straightforward and may be processed incre- 
mentally without "having to wait" until sufficient context is provided (cf. Born- 
kessel-Schlesewsky & Schlesewsky 2014: 107). This is why patterns produced by 
the (global) discriminatory function often become conventionalized (cf. Aissen 
2003; Zeevat & Jäger 2002; Jäger 2004; Malchukov 2008: 208, 213). 

The concept of STRENGTH OF UNIVERSALS, in particular, of weak universals, is 
relatively new to linguistics (though see Bickel 2013 for some discussion). How- 
ever, it ties in with the insight that human cognition in general and language 
acquisition in particular are better characterized by probabilistic biases or con- 
straints ranging from weak to strong (cf. Thompson et al. 2016). Moreover, it 
seems that very strong (absolute) universals have a different motivation than 
weak universals. The former may indeed reflect some innate properties of hu- 
man beings, as suggested by nativists (cf. Chomsky 1965), though not necessar- 
ily domain-specific properties. For example, the universal that all languages must 
have vowels (Comrie 1989: 19) is a very strong, probably, absolute universal. It 
seems likely that it is caused by innate properties of the human articulatory (and 
auditory?) apparatus. By contrast, weak universals are rather motivated by
cultural evolution, for example, by the striving for efficient communication
between individuals (Haspelmath 2019 [this volume]). 

Weak universals constitute a number of challenges for typological research. 
While strong universals override all potentially competing pressures and can
thus be detected by relatively simple techniques, weak universals enter into
competition with both other functional motivations and language-specific factors,
not least the source meaning of the relevant marker (cf. Cristofaro 2012,
Cristofaro 2017, Hammarström 2015). The only way of modelling this adequately
is a fine-grained competing-motivations account (cf. Haiman 1983; Du Bois 1985;
Croft 2003: 59; Bickel 2014: 115; Hawkins 2014: 60, 69; pace Cristofaro 2019 [this
volume]). For the same reason, weak universals also pose a methodological
problem for typological testing for universality, even on the dynamic approach that
relies on the transition from STAGE 0 into STAGE 1. Dynamic methods based on
transitional probabilities do take into account one of the competing motivations,
namely, the impact of inheritance (transitional probabilities are measured given
the original state, i.e. STATE 0). However, many other factors that may influence 
the probability of change towards a particular pattern are glossed over on this 
approach as well. Finally, weak universals raise an important question about the 
nature of evidence in typology. Traditionally, typologists have been interested 
in defining what qualifies as positive evidence. Statistically significant signals 
that are due to the common genealogical or areal relationships of the languages 
of the sample have been ruled out as not offering positive evidence for univer- 
sality. Other types of signals that may not count as positive evidence, such as 
same-source constructions, have also been identified (cf. Cristofaro 2017; Collins 
2019 [this volume]). At the same time, a definition of what really counts as neg- 
ative evidence, i.e. the proof of absence, is missing. As was argued in this paper, 
a random distribution in the sample given coarse data mining methods without 
taking the dynamic and historical evidence into account, might not be sufficient. 
This is problematic because, intuitively, it seems probable that strong universals 
are only the tip of the iceberg, not being numerous, while many more universals 
are rather weak universals of the type investigated here. 

In contrast to optimality-theoretic approaches that also primarily assume competition among 
universal constraints (cf. Aissen 2003), an adequate approach to weak universals has to take 
language-specific forces into account as well. 

Abbreviations 


All examples abide by the Leipzig Glossing Rules. Additional abbreviations are: 


DOM   differential object marker 
DYN   dynamic 
FA    factitive 
PS    person 
REC   recent past 
RN    relational noun 
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It seems to be a robust empirical observation that independent possessive person- 
forms (such as English mine, yours, hers) are always longer than (or as long as) 
the corresponding adnominal possessive person-forms (such as English my, your, 
her). Since adnominal forms are also much more frequent in discourse than in- 
dependent forms, this universal coding asymmetry can be subsumed under the 
grammatical form-frequency correspondence hypothesis (Haspelmath et al. 2014). 
In other words, the fact that independent possessive forms are longer can be seen 
as a functional response to the need to highlight rarer, less predictable forms. 

In this paper, I present evidence from creole languages and show that, irrespective
of their young age and extremely accelerated grammaticalization processes, 
these high-contact languages confirm the coding asymmetry. Moreover, creole lan- 
guages, just as non-creole languages, show a diverse array of diachronic pathways 
all leading eventually to longer independent possessive person-forms. Such a case 
of multi-convergence of structures through very different diachronic processes 
strongly suggests that the current patterns cannot be explained exclusively on the 
basis of the sources and the kinds of changes that commonly give rise to indepen- 
dent (and adnominal) possessive forms, but that there is an overarching functional 
efficiency principle underlying these coding asymmetries. 


Susanne Maria Michaelis. 2019. Support from creole languages for functional adaptation in gram- 
mar: Dependent and independent possessive person-forms. In Karsten Schmidtke-Bode, Natalia 
Levshina, Susanne Maria Michaelis & Ilja A. Seržant (eds.), Explanation in typology: Diachronic 
sources, functional motivations and the nature of the evidence, 179–201. Berlin: Language Science 

1 Introduction 


Languages are functionally adapted to their users' needs in a variety of ways. This 
can be seen in a range of different domains, such as (i) text genres, (ii) social struc- 
ture and (iii) the ecological environment. The genre of informal, spontaneous 
face-to-face communication is reflected in grammatical features of loosely con- 
nected discourse with mainly coordinated or juxtaposed sentences, many hesita- 
tion phenomena, overlapping utterances, and piecemeal structuring of informa- 
tion in accordance with online processing needs, whereas text genres intended 
for formal, planned, out-of-context, written communication show densely inte- 
grated information, multiple syntactic embedding strategies and therefore longer 
sentences, and greater syntagmatic variation (Koch & Oesterreicher 2012
[1985]). Secondly, languages are adapted to the social structuring of their users,
for instance to the percentage of second language speakers in a speech
community: In a well-known study, Lupyan & Dale (2010) analyzed data from the World 
atlas of language structures (Haspelmath et al. 2005) and found that the greater 
the number of second language speakers in a speech community, the simpler are 
aspects of the morphology of the languages spoken by these communities. In a 
similar vein, Bentz & Winter (2013) found that languages with many second lan- 
guage speakers tend to have fewer morphological cases. And third, it has been 
shown that speakers adapt their languages to their ecological environments, for 
example by using whistled speech in distant communication to overcome the 
background noise of rural environments (Meyer 2005; 2008). 

In the present chapter, I will look at yet another instance of functionally adapt- 
ed linguistic structures: efficiency-based universal coding asymmetries in gram- 
mar, also called form-frequency correspondences (see Haspelmath 2019 [this vol- 
ume]). More specifically, I will discuss one specific universal coding asymmetry 
resulting from asymmetric frequency of use patterns in discourse: the difference 
between dependent and independent possessive person-forms. Independent per- 
son-forms such as mine, yours, hers, and ours are coded with forms that are longer 
than or equally long as dependent possessive person-forms such as my, your, her, 
and our. I claim that the reason for this is a general efficiency principle: Less fre- 
quent and therefore more surprising meanings need more costly coding than 
more frequent and therefore more predictable meanings. 

Such functional-adaptive explanations have a diachronic component (Bybee 
1988): Since the current system is often rigidly conventional, the adaptive forces 
must have been active in earlier diachronic change. But how can we understand 
such a development? Functionally adapted coding asymmetries, as seen in
dependent/independent possessive person-forms, are the outcome of hundreds,
sometimes thousands of years of language change processes. These processes reflect
countless speech acts between interlocutors adding up incrementally and
resulting in the crystallization of functionally adapted grammatical structures over
time. As grammatical change progresses at an extremely slow pace compared
to other cultural evolutionary processes, the step-by-step changes which bring
about functionally adapted grammatical structures are often opaque or difficult
to trace, even in languages with a well-documented written history (see Seržant
2019 [this volume]). To circumvent this difficulty, I will focus on creole
languages, which are born out of extremely accelerated change processes in the
context of the European colonial expansion, roughly during the 16th to 20th
centuries. These high-contact languages have evolved their complex grammatical 
structures within only a few hundred years. In this way they are a good test case 
for functional-adaptive change processes because creoles demonstrate in a kind 
of fast motion what happens to grammatical structures under functional pres- 
sures, which in less contact-influenced languages would have taken hundreds 
(or thousands) of years to evolve. In this way, creoles open a unique window on 
grammatical change processes which in these languages can be traced gradually 
from their transparent source constructions to various further grammaticalized 
stages, processes which are supposed to be operative in all languages at all times, 
but which take much more time to proceed in languages less heavily influenced 
by contact. 

I make two main points in this paper: 

(i) Evidence from creole languages indeed confirms the coding asymmetry: 
Independent person-forms are coded with forms that are always longer than, or 
as long as, the dependent person-forms, but never shorter. 

(ii) Creole languages, just as non-creole languages, show a diverse array of 
diachronic pathways all leading eventually to longer independent possessive 
person-forms. Such a case of multi-convergence of structures through very dif- 
ferent diachronic processes strongly suggests that there is an overarching func- 
tional efficiency principle underlying these coding asymmetries (see Haspelmath 
2019 [this volume]). 

After introducing the coding asymmetry in possessive person-forms in §2, 
in §3 I discuss various types of source constructions and diachronic pathways 
which lead to longer independent possessive person-forms. Then in §4, I present 
a range of cases from creole languages and their various diachronic pathways. 
In §5, I consider but ultimately reject some alternative explanations against the 
background of the functional efficiency-based explanation adopted in this article. 

2 Coding asymmetry: Dependent vs. independent 
possessive person-forms 


Dependent possessive person-forms always occur together with an overt noun 
within a nominal phrase, as in your house, whereas independent possessive per- 
son-forms occur without an overt noun, as in mine. In the latter case, the referent 
of the noun is understood from the context because of an anaphoric relationship, 
as in (1a) and (1b), or because of a predicative use, as in (1c). 


(1) English 
a. Your house is bigger than mine. (= ‘than my house’) 
b. Their dog is in a kennel, but ours sleeps under my bed. (= ‘our dog’) 
c. Is this bike yours? (= ‘your bike’) 


In a recent study, Ye (2017)¹ has found that in the world's languages
independent possessive person-forms like English mine, French le mien ‘mine’, and
Mandarin Chinese wo de ‘mine’ are coded with forms that are longer than the
corresponding dependent possessive person-forms, such as English my and
French mon ‘my’, or at least not shorter, as illustrated by Mandarin Chinese
wo de ‘my’. Coding length here refers to the number of segments in the signal,
or possibly to the amount of biomechanical effort (see Napoli et al. 2014 with
regard to sign languages). Most importantly, examples of counter-asymmetric
coding are not attested, i.e. there are no languages where the dependent
possessive person-forms are longer than independent possessive person-forms, e.g.
*mine house vs. my ‘mine’. Note that (in)dependent possessive person-forms can
be manifested through a range of language-specific structures, also embracing
complex forms, such as combinations of articles or adpositions with pronouns,
as in French le mien and Mandarin Chinese wo de [I GEN]. 

Table 1 shows a number of different types of correspondences between de- 
pendent and independent person-forms in the world's languages: Firstly, many 
languages code the two types of person-forms identically and thus with equally 
long forms, as for instance in Mandarin Chinese. In other languages, the inde- 
pendent person-form has an additional marker compared to the dependent form. 
This can be a substantivizer, as in Lezgian (-di), or an additional stem, as in Ka- 
nuri (kaá-). In some languages the definite article is used to form the independent 
person-form, such as in Italian la mia (with kinship terms like sorella ‘sister’).²
Yet another synchronic pattern in independent person-forms consists in having
extra material on the dependent form, as in Coptic p-ó-k [ART-INDEP-2SG] ‘yours’
(vs. p-ek-ran [ART-2SG-name] ‘your name’).

¹Ye (2017) analyzes a sample of 69 genealogically and areally unrelated languages.

²If nouns like casa ‘house’ or libro ‘book’ were considered, Italian would be classified just like
Chinese (identical pattern) because there would be no coding difference: la mia casa ‘my house’
vs. la mia ‘mine’, il mio libro ‘my book’ vs. il mio ‘mine’.


Table 1: Some types of correspondences of dependent and independent 
person-forms 


Pattern type       Language   Dependent       Independent     Source 
                              person-form     person-form 
identical          Mandarin   wo de shu       wo de 
                   Chinese    I GEN book      I GEN 
                              ‘my book’       ‘mine’ 
additional marker  Lezgian    zi ktab         zi-di           Haspelmath (1993: 110) 
                              I.GEN book      I.GEN-SUBST 
                              ‘my book’       ‘mine’ 
additional stem    Kanuri     fewá-ndé        kaá-nde         Cyffer (1998: 31f.) 
                              cow-1PL.POSS    INDEP-1PL 
                              ‘our cows’      ‘ours’ 
additional article Italian    mia sorella     la mia          Schwarze (1988: 44, 286f.) 
                              ‘my sister’     ‘mine’ 
longer form        Coptic     p-ek-ran        p-ó-k           Haspelmath (2015: 277) 
                              ART-2SG-name    ART-INDEP-2SG 
                              ‘your name’     ‘yours’ 


Apparently the only possible generalization which can be drawn from the
typological variation is that the independent person-form is always longer than,
or as long as, the dependent person-form, but never shorter.³


Now the claim is that these coding asymmetries reflect asymmetries of frequency
of use. More frequent meanings (here: dependent possessives) are more
predictable, and speakers or signers can therefore reduce the amount of
linguistic signal, taking into account how much of the signal hearers (or
receivers, in sign languages) need in order to successfully reconstruct the
intended meaning. By contrast, less frequent meanings (here: independent
possessives) need a greater amount of signal coding for the hearer to be able to
infer the meaning.

³ See also Croft (1991), who very similarly predicts “function-indicating
morphosyntax” in all the atypical combinations of lexical semantic class and
pragmatic function, whereas typical combinations lack function-indicating
markers (Croft 1991: 51), e.g. marked predicative nominals vs. unmarked nouns,
or marked predicative adjectives vs. unmarked attributive adjectives.

Indeed, frequency counts from large text corpora of three different languages
(English, Korean, and Mandarin Chinese⁴) confirm the hypothesis that dependent
and independent person-forms are unequally spread over discourse in such a way
that dependent possessive person-forms are generally more frequent than their
independent counterparts. Table 2 shows data from British English.


Table 2: (In)dependent possessive person-forms in the British National Corpus

Dependent   Token frequency   Independent   Token frequency
my                  145,250   mine                    6,067
your                132,598   yours                   4,059
our                  92,314   ours                    1,658
their               251,410   theirs                    976


Interestingly, frequency counts from Mandarin Chinese, a language without a 
coding asymmetry in possessive person-forms, give the same results as counts 
for English and Korean, which have the coding asymmetry in possessive person- 
forms (see Ye 2017). Therefore, the prediction is that we find similar frequency 
distributions of dependent and independent possessive person-forms in all lan- 
guages, independently of whether the universal coding asymmetry is grammati- 
calized or not. 
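The size of the frequency asymmetry in Table 2 can be made explicit by dividing each dependent count by its independent counterpart. The following is only a minimal sketch: the token counts are copied from Table 2, while the names `BNC_COUNTS` and `frequency_ratios` are illustrative assumptions and not part of the original study.

```python
# Token frequencies copied from Table 2 (British National Corpus).
# Keys are (dependent form, independent form); values are their token counts.
BNC_COUNTS = {
    ("my", "mine"): (145_250, 6_067),
    ("your", "yours"): (132_598, 4_059),
    ("our", "ours"): (92_314, 1_658),
    ("their", "theirs"): (251_410, 976),
}


def frequency_ratios(counts):
    """Return the dependent/independent token-frequency ratio for each pair."""
    return {pair: dep / indep for pair, (dep, indep) in counts.items()}


ratios = frequency_ratios(BNC_COUNTS)
for (dep_form, indep_form), ratio in ratios.items():
    print(f"{dep_form}/{indep_form}: dependent form is {ratio:.1f} times as frequent")
```

A ratio above 1 means that the dependent form is the more frequent member of the pair, which is the direction the frequency-based account predicts.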


3 Types of source constructions and diachronic pathways 


As noted earlier, synchronic universal coding asymmetries have a diachronic
correlate because the adaptive forces must have been active in earlier stages of
the language and have kept shaping grammatical structures according to the
functionally motivated efficiency principle: less predictable meanings need more
coding and more predictable meanings need less coding.

There is a wide variety of sources and diachronic pathways by which independent
possessive person-forms come to be longer than the dependent forms. Generally,
one can distinguish two scenarios: either the more frequent member of the
grammatical opposition is shortened (Bybee 2007), or the rarer member of the
grammatical opposition is lengthened⁵ (Haspelmath 2008). In the shortening
scenario, speakers assess what hearers can predict and adjust their
articulations accordingly, resulting in shortening of the signal of the more
frequent form of a grammatical opposition. In this way, Old English min ‘my’ was
eventually shortened to Modern English my; likewise, Old Spanish mío was
shortened to Modern Spanish mi. The Coptic contrast between pók ‘yours’ and pek-
‘your’ that we saw in Table 1 is likewise attributable to shortening of the
earlier full person-form pók to pek-. The shortened form became a dependent
person-form, whereas the old form pók became restricted to the independent
function (Eitan Grossman p.c.).

⁴ For frequency counts in Korean and Mandarin Chinese, see Ye (2017).

The lengthening scenario can be described as follows: When hearers are in danger
of making wrong predictions, speakers tend to help them by using forms for the
rarer member of the opposition that have been lengthened with some extra
material. One example comes from German, where the independent form der
mein-ig-e [DEF 1SG.POSS-INDEP-MASC.SG.NOM] ‘mine’ is based on the dependent form
mein ‘my’ plus an additional suffix -ig, which occurs in other derived
adjectives (like selb-ig ‘same’, bärt-ig ‘bearded’, ehrgeiz-ig ‘ambitious’). As
we see in Tables 3 and 4, the array of source constructions and diachronic
pathways which give rise to longer independent possessive person-forms is very
diverse.


Table 3: Shortened dependent form

Language   Strategy                                   Dependent form   Independent form
English    phonological reduction of dependent form   my               mine

The different strategies range from the use of a dummy noun (‘my thing’, ‘my
property’), intensified person forms (‘my own’), the use of adpositions (‘of
my’) and definite articles (‘the my’) to a general nominalizer (‘my one’). One
special strategy to arrive at longer independent possessive person-forms
consists in recruiting already existing (lengthened) pronominal forms which have
been used for other grammatical functions. One example comes from Middle English
varieties, where the independent possessive forms her-n, our-n, their-n (still
surviving in English dialects today, see Kortmann & Lunkenheimer 2013) go back
to erstwhile feminine dative case-marked pronominal forms with the suffix -n
(hire-n [3SG.FEM.DAT] ‘to her’). In Middle English, such dative forms got
re-used, or “exapted”, to function as independent possessive forms, also under
the additional analogical pressure from the my/mine and thy/thine oppositions
(see Allen 2002, and for the notion of exaptation, see Lass 1990; 2017; Norde &
Van de Velde 2016 and the discussion below).

⁵ Here, the term ‘lengthening’ mainly refers to processes by which a given
linguistic form is expanded or augmented by new lexical or morphosyntactic
material. In principle, however, lengthening may also pertain to
phonological/phonetic processes, such as vowel lengthening or gemination.


Table 4: Lengthened independent form

Language             Strategy                  Dependent form     Independent form

German               affixal lengthening       mein               der mein-ige
                                               [1SG.POSS]         [DEF 1SG.POSS-INDEP]
Arabic               dummy noun: ‘property’    -ii                milk-ii
                                               [1SG.POSS]         [property-1SG.POSS]
Greek                intensified person        mu                 dhikó mu
                     form ‘own’                [1SG.POSS]         [INTENS 1SG.POSS]
Diu Indo-Portuguese  use of adposition         mi                 da mi
                     ‘of, for’                 [1SG.POSS]         [of 1SG.POSS]
Albanian             use of definite article   im                 im-i
                                               [1SG.POSS]         [1SG.POSS-DEF]
Berbice Dutch        general nominalizer       eke                eke-je
                                               [1SG.POSS], [1SG]  [1SG.POSS-NMLZ]
English (dialectal)  exaptation                her                her-n
                                               [3SG.F.POSS]       [3SG.POSS-INDEP]

Irrespective of the shortening or the lengthening scenario, ALL these
developments result in coding asymmetries which work in the SAME direction: The
less frequent member (here the independent possessive person-form) is coded with
a form that is always at least as long as that of the more frequent member of
the pair, but never shorter.
Now how do creole languages fit into this picture? In the next section, I will
consider possessive person-forms in various creole languages from around the 
world (based on the Atlas of pidgin and creole language structures, Michaelis et 
al. 2013, apics-online.info) to check whether the universal trend identified by 
typological work can be supported by these high-contact languages. 


4 Diverse pathways in creoles 


Before looking at possessive person-forms in creole languages, I would like to 
highlight one characteristic feature of these languages which is crucial for the 
argument put forward in this paper: Creole languages show an unusual amount 
of freshly grammaticalized material due to an accelerated pace of grammatical 
change processes (Haspelmath & Michaelis 2017; Michaelis & Haspelmath forth- 
coming). Examples come from tense-aspect-mood markers, such as the Negerhol- 
lands future tense marker lo < loo ‘go’ < Dutch lopen ‘run’, or the Jamaican ante- 
rior marker wehn < English been. Creoles also show newly grammaticalized case 
markers, such as the dative marker pe in Diu Indo-Portuguese (< Portuguese 
para), the accusative marker ku in Papiá Kristang (< Portuguese com ‘with’), or 
voice markers, such as the reciprocal marker kanmarad in Seychelles Creole (< 
French camarade). The explanation for these widespread newly grammaticalized 
markers appears to be as follows: Speakers communicating in high-contact situ- 
ations which involve many second language speakers tend to rely on extra trans- 
parency of their utterances in order to successfully get their messages across. 
These instances of extra transparency give rise to newly grammaticalized struc- 
tures by refunctionalizing erstwhile content words or otherwise less grammati- 
calized constructions, as seen in the examples cited above. 

Turning to possessive forms, let us now consider the following three guiding 
questions: 


* Do creoles confirm the universal coding asymmetry discussed in this pa- 
per? 


* Does the need for extra transparency translate into freshly grammatical- 
ized constructions also in the domain of possessive person-forms? 


* Which kinds of source constructions give rise to the various possessive 
person-forms? 


⁶ See already Seuren & Wekker (1986) for the notion of transparency in the creolization process.
The answer to the first question is a straightforward yes: The creole evidence,
which comes from 59 creoles world-wide with different lexifier and substrate
languages (see Haspelmath & APiCS Consortium 2013 and Figure 1 in the Appendix),
confirms the universal coding asymmetry: Independent possessive person-forms are
coded with forms that are longer than or equally long as dependent possessive
person-forms. Some examples are given in Table 5.


Table 5: Dependent and independent possessive person-forms in some creole languages

Creole language                          Dependent form            Independent form

Bislama (Meyerhoff 2013)                 blong yu                  blong yu
                                         [POSS 2SG] ‘your’         [POSS 2SG] ‘yours’
Kinubi (Luffin 2013)                     tá-i                      tá-i
                                         [POSS-1SG] ‘my’           [POSS-1SG] ‘mine’
Batavia Creole (Maurer 2013)             minya                     minya sua
                                         [1SG.POSS] ‘my’           [1SG.POSS POSS] ‘mine’
Martinican Creole (Colot & Ludwig 2013)  -mwen                     ta mwen
                                         [1SG.POSS] ‘my’           [POSS 1SG.POSS] ‘mine’
Pichi (Yakpo 2013)                       yu                        yu yon
                                         [2SG.POSS], [2SG] ‘you’   [2SG.POSS own] ‘yours’
Palenquero (Schwegler 2013)              mi                        ri mi
                                         [1SG.POSS] ‘my’           [of 1SG.POSS] ‘mine’


Table 6 presents a quantitative overview of the different construction types
found in the creole languages of APiCS. Here, only languages with an exclusive
value assignment are considered (48 out of 59 creole languages).

Likewise, the answer to the second question raised above is positive: The
majority of the possessive person-forms are indeed freshly grammaticalized and
therefore still transparent enough to be traced quite closely with respect to
the different diachronic processes that have brought about their coding
asymmetry.
Table 6: Distribution of different construction types over 48 creoles in
independent possessive person-forms (APiCS Feature 39)

Coding pattern  Feature value                                       Number of creole
                                                                    languages in APiCS
Symmetry        Identical to dependent pronominal possessor         20
Asymmetry       Special adposition plus pronoun                      9
                Other word plus dependent pronominal possessor      13
                Special form for independent pronominal possessor    6
Total                                                               48


Coding asymmetries explicitly allow for the two forms of an opposition to be
equally long (either overtly or zero-coded),⁷ as is the case in Mandarin Chinese
wo de ‘my’, ‘mine’ cited above. As Table 6 shows, there are quite a number of
creole languages which show this coding pattern, i.e. no length difference in
the coding of the two forms, as for instance in Tok Pisin bilong mi [POSS 1SG]
‘my’, ‘mine’ or the related language Bislama (see Table 5). These languages do
not contradict the universal coding asymmetry, as they do not show the opposite
coding pattern, i.e. longer dependent forms against shorter independent forms.
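The counts in Table 6 can be aggregated by coding pattern to show how the 48 creoles divide between symmetric and asymmetric coding. This is a minimal sketch under stated assumptions: the counts and feature-value labels are copied from Table 6, whereas the names `FEATURE_VALUES` and `pattern_totals` are hypothetical helpers introduced here for illustration.

```python
# Counts copied from Table 6 (APiCS Feature 39; 48 creoles with an exclusive value).
# Each feature value is mapped to (coding pattern, number of creole languages).
FEATURE_VALUES = {
    "Identical to dependent pronominal possessor": ("symmetry", 20),
    "Special adposition plus pronoun": ("asymmetry", 9),
    "Other word plus dependent pronominal possessor": ("asymmetry", 13),
    "Special form for independent pronominal possessor": ("asymmetry", 6),
}


def pattern_totals(values):
    """Sum the language counts for each coding pattern."""
    totals = {}
    for pattern, count in values.values():
        totals[pattern] = totals.get(pattern, 0) + count
    return totals


totals = pattern_totals(FEATURE_VALUES)
n_languages = sum(totals.values())
shares = {pattern: count / n_languages for pattern, count in totals.items()}
for pattern, count in totals.items():
    print(f"{pattern}: {count} of {n_languages} languages ({shares[pattern]:.1%})")
```

Note that Table 6 contains no feature value for the counter-asymmetric pattern (longer dependent form), so no such category appears in the totals.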

Let us now turn to creole languages for which we can attest a coding asymmetry
in possessive forms. As for the source constructions, I will first look at cases
of shortening that parallel the English development from mine to my. One example
comes from Juba Arabic, where the original form bita-i [POSS-1SG] ‘my/mine’ gets
shortened and at some point reanalyzed as the dependent possessive tá-i ‘my’, as
in ida tái [hand 1SG.POSS] ‘my hand’ (Manfredi & Petrollino 2013), whereas the
older non-shortened form bita-i continues to be used as the independent
possessive form meaning ‘mine’.

However, the vast majority of asymmetric correspondence types in creole
languages, as in non-creole languages, follow the second scenario described in
§3: the coding asymmetry comes about by some process of expanding the less
frequent member of the grammatical opposition. One widespread source is the use
of an adposition going back to ‘of’ or ‘for’ in one of the European lexifier
languages (French, Portuguese, English etc.). An example comes from
Portuguese-based Santome (Hagemeijer 2013), where the dependent possessive
person-form mu ‘my’ is expanded by the genitive preposition ji (< Portuguese de
‘of’), giving rise to the independent possessive form ji mu ‘mine’. Jamaican
fi-mi ‘mine’ is another instance of the lengthening of the dependent form mi
[1SG.POSS] (and also [1SG] ‘I’) by the preposition fi ‘for’ (< English for).

⁷ See also Croft (1991: 58f.), who calls such cases NEUTRAL evidence.

A second source construction for independent possessive person-forms in creole
languages involves the use of a dummy noun, such as ‘part’ or ‘thing’ (as
mentioned above), as in Haitian Creole pa m nan [part 1SG.POSS DEF] ‘mine’ (lit.
‘my part’, pa < French part ‘part’) as opposed to dependent forms, such as -m
(nan) [1SG.POSS (DEF)] ‘my’ in se m [sister POSS.1SG] ‘my sister’. The
polysemous morpheme pa, which in some contexts still has the original lexical
meaning ‘part’, has grammaticalized into a possessive form which can also be
used in contexts where the possessor is stressed, as in (2).


(2) Haitian Creole (Fattier 2013)
    Liv  pa   m        nan bel.
    book POSS POSS.1SG DEF beautiful
    ‘MY book is beautiful.’


However, the non-stressed noun phrase would be liv m [book POSS.1SG] ‘my book’
(Fattier 2013). Here, we clearly see that the postposed morpheme pa in pa m does 
not denote a part of something, but has grammaticalized into a possessive marker, 
as the literal meaning ‘book my part’ is not available for this construction. The 
same holds for the independent possessive form pa m nan 'mine': the meaning 
is not ‘my this part’, but pa has become part of the newly grammaticalized inde- 
pendent possessive form ‘mine’. 

A third source construction for independent possessive forms features an
intensifier which is added to the dependent possessive, as in Krio mi yon
[1SG.POSS INTENS.OWN] ‘mine’ (the dependent possessive form being mi ‘my’)
(Finney 2013).

There is a fourth source of independent forms involving a general (adjectival)
nominalizer, such as ‘one’. In Berbice Dutch, there is a general nominalizer -je
which is added to the personal pronoun eke [1SG.POSS]/[1SG] ‘my’ (‘I’),
resulting in eke-je [1SG.POSS-NMLZ] ‘mine’ (see Table 4). This nominalizer goes
back to Eastern Ijo, the substrate language of Berbice Dutch, where it has
singular nonhuman reference, whereas in Berbice Dutch it has grammaticalized
into a generic nominalizer (Kouwenberg 2013).
A fifth source can be illustrated with an example from Reunion Creole, where
the determiner/demonstrative sa is one of the lengthening elements (besides the
genitive preposition d) in the independent possessive person-form sa d mwen
[DEM of 1SG] ‘mine’, compared to the dependent form mon [1SG.POSS] ‘my’.

In some creole languages the source construction is not known, as in Louisiana
Creole. Here, the marker kenn is used as a morpheme to code the independent
possessive person-forms, as in mo-kenn [1SG.POSS-POSS] ‘mine’. This morpheme
could perhaps be traced back to a 2SG.FEM independent person-form in French,
tienne ‘yours’, which has developed into /kien/ and would then have analogically
spread to the whole paradigm, as in mo-kenn [1SG.POSS-POSS] ‘mine’, to-kenn
[2SG.POSS-POSS] ‘yours’, li-kenn [3SG.POSS-POSS] ‘his’ (Neumann-Holzschuh &
Klingler 2013, Neumann-Holzschuh p.c.). The unusual feature in this scenario is
the idea that it is the second-person form which analogically spreads to all
other persons, and not the more frequent 1SG or 3SG forms. Whether this is the
right reconstruction of the origin of kenn is not clear.

Generalizing over all instances of newly grammaticalized independent possessive
forms in creole languages, we can state that irrespective of the diverse source
constructions, it is the independent possessive person-form that, in ALL
instances, is longer than, or as long as, the dependent person-form, but never
shorter.
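The generalization can be stated as a simple check over pairs of forms. The sketch below applies it to the examples from Table 5. It rests on a strong simplifying assumption that is not part of the original analysis: form length is approximated by the number of letters in the orthographic representation, which is only a crude proxy for phonological length. The names `TABLE_5_FORMS`, `form_length` and `classify` are hypothetical.

```python
# (language, dependent form, independent form), copied from Table 5.
TABLE_5_FORMS = [
    ("Bislama", "blong yu", "blong yu"),
    ("Kinubi", "tá-i", "tá-i"),
    ("Batavia Creole", "minya", "minya sua"),
    ("Martinican Creole", "-mwen", "ta mwen"),
    ("Pichi", "yu", "yu yon"),
    ("Palenquero", "mi", "ri mi"),
]


def form_length(form):
    """Crude length proxy (an assumption): count letters, ignoring hyphens and spaces."""
    return sum(1 for ch in form if ch.isalpha())


def classify(dependent, independent):
    """Classify a pair by comparing the lengths of the two forms."""
    dep_len, indep_len = form_length(dependent), form_length(independent)
    if indep_len > dep_len:
        return "asymmetric"          # independent form is longer (predicted)
    if indep_len == dep_len:
        return "symmetric"           # equally long (compatible with the universal)
    return "counter-asymmetric"      # dependent form is longer (predicted not to occur)


results = {lang: classify(dep, indep) for lang, dep, indep in TABLE_5_FORMS}
for lang, pattern in results.items():
    print(f"{lang}: {pattern}")
```

Under this proxy, a counterexample to the universal would surface as a pair classified as counter-asymmetric.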


5 Possible alternative explanations 


We have seen that the cross-creole data support the universal coding asymmetry
in possessive person-forms, and that this synchronic asymmetry can be explained
by a functional-adaptive constraint of coding efficiency: More frequently
expressed meanings (dependent possessives) need less costly signal encoding
because they are highly predictable, whereas less frequently expressed meanings
need more robust signal encoding because they are less predictable (Haspelmath
2019 [this volume]; see Norcliffe & Jaeger 2016 and Jaeger & Buz 2018 for
supporting psycholinguistic evidence in other domains of morphosyntax). Before
concluding this paper, I will consider several alternative explanations, but
reject them all as less convincing.


5.1 Semantics, iconicity, and syntax 


Some functional linguists might argue for an alternative, semantically based or 
iconicity-based explanation here, namely that the independent possessive form is 
semantically more complex in that it combines possession and referentiality, and
so additional material has to be adduced in order to express this more complex 
concept, or to compensate for the absence of an overt nominal. 

But I would reject such a proposal because it is not obvious that independent
possessors are semantically more complex. Rather, we can think of the situation
as follows: Possessors refer to objects and persons, but at the same time, when
used in possessive constructions, they also express properties, like adjectives.
In the most frequent use, possessive forms (again like adjectives) have a
modification function, as in my house (the “unmarked” use in terms of Croft
1991). But when possessive forms are used in the less frequent referential
function, as in mine, specific marking is needed to highlight this unusual
noun-like usage. Semantically, there is not really any difference in complexity
between the two kinds of person-forms: dependent possessive forms combine person
and property with regard to possession in a MODIFICATION function, whereas
independent person-forms combine person and property with regard to possession
in a REFERENCE function. There is thus only a difference in the propositional
function in which the semantic concepts are expressed (modification vs.
reference), but there is no ADDITIONAL semantic complexity in independent
possessive person-forms.

Likewise, some linguists might argue that the motivation for the coding
asymmetry is purely syntactic, as the two possessive forms occupy different
syntactic slots. As a modifier such as French mon cannot occur as the head of an
NP, it has to be transformed into a noun by what Croft (1991: 58f.) calls
“function-indicating markers”, thus yielding le mien ‘mine’ in French. The use
of the definite article represents one of the lengthening processes in
independent possessive person-forms that I described above. But I would
interpret the mere use of function-indicating markers as the frozen
grammaticalized results of hundreds or thousands of years of speakers performing
communicatively efficient speech acts by marking the less predictable meanings
with more elaborate linguistic matter. In this respect, there is no
contradiction between today's syntax and yesterday's (and earlier) speakers'
preferences to highlight less predictable meanings by more morphosyntactic
material, which accumulated over generations and eventually contributed to the
shaping of syntactic categories (see Norcliffe & Jaeger 2016: 171).⁸


⁸ “Communicative efficiency therefore holds explanatory potential not just for
patterns of real-time language use, but also for the shape of grammars”
(Norcliffe & Jaeger 2016: 171).
5.2 Diachronic change as a possible explanatory factor


Yet a different type of explanatory account might propose that the diachronic 
origins of the relevant patterns give rise to the observed cross-linguistic distribu- 
tions (see Cristofaro 2017, and Cristofaro 2019 [this volume]). The claim would 
be that the kinds of sources and diachronic pathways that bring about the ob- 
served patterns are tightly constrained (mutational constraints, see Haspelmath 
2019 [this volume]) and, crucially, that the coding asymmetry is a direct but in- 
cidental result of how independent possessive person-forms emerge from their 
respective sources. 

The strongest argument against such a possible claim, and for an interpretation
of the data in terms of a functional-adaptive, result-oriented approach, is the
fact that we see convergence of multiple sources and pathways toward a UNIFORM
outcome. In particular, the asymmetric coding can come about through shortening
or through lengthening. If there were no overarching functional constraint, we
would expect many more counter-examples in the data, i.e. cases where the
dependent possessive person-forms are longer than the independent ones, such as
dependent *mine book vs. independent *my ‘mine’, or German dependent *mein-iges
Buch ‘my book’ vs. independent *mein ‘mine’, or Jamaican dependent *fi-mi buk
‘my book’ vs. independent *mi ‘mine’. But this is not what we find.

The creole data make clear that there is a surprisingly large array of source
constructions which enter the pool of possible dependent and independent
possessives. Many of these source constructions had different communicative
functions when they were first grammaticalized. The use of a dummy noun ‘part’,
for instance, which is the source of current Haitian Creole independent
possessive pa m nan ‘mine’, may have started out as a predicative focus
construction, such as ‘this is MY part’. This focussing function is still
present in constructions like the one in example (2). But at some point, the
morpheme pa got refunctionalized into the phrase pa m nan, which eventually got
grammaticalized into the independent possessive person-form ‘mine’. How did this
happen? I assume that speakers must have somehow felt that they needed a more
elaborate, more fully marked form to convey to hearers that a less predictable
meaning (independent possessive) was expressed. Therefore they chose (elements
of) an already existing construction, here the focus construction, and through a
kind of inflationary overuse grammaticalized it into the independent possessive
form pa m nan, where the morpheme pa does not have the meaning ‘part’ anymore.
It is only at this moment that speakers created a grammatical opposition between
a dependent and an independent possessive form.
Another source of a longer independent possessive person-form is the use of a
preposition ‘of/for’ together with a possessive person form ‘my/I’, yielding
complex forms, such as ‘of my’ or ‘for me’, as seen in the Jamaican independent
possessive form fi-mi ‘mine’ (vs. dependent possessive mi [1SG.POSS] ‘my’/[1SG]
‘I’, already cited above). Forms like fi-mi may go back to a kind of predicative
construction, such as ‘this is for me/this is of my’. But here again, at some
point in time, the creators of Jamaican refunctionalized the chunk fi-mi to fit
the need to highlight the more unusual, less predictable independent possessive
meaning ‘mine’.

In this context, another fact makes a source-oriented account less convincing.
Quite a few creole languages show lengthened forms, such as fi-mi, not only in
the independent, but also in the dependent possessive person-form, as for
instance in Zamboanga Chabacano dimiyo [of.1SG] ‘my/mine’ or in Tok Pisin bilong
mi [of.1SG] ‘my/mine’. This is the situation where there is no length difference
between the two forms, as illustrated for Mandarin Chinese in §2 (identical
pattern in Table 1). If a hypothesized predicative construction is the source of
the independent possessive person-forms, it certainly cannot be the source of
the dependent form. Therefore, here we must allow for some kind of analogical
extension to the dependent forms. Interestingly, it is only in the dependent
possessive function that dimiyo can be shortened to mi (Steinkrüger 2013), thus
again giving rise to a new coding asymmetry in the predicted direction: the
independent possessive form dimiyo ‘mine’ is longer than the dependent
possessive form mi (similar to English mine/my and Juba Arabic bitai/tai).

Coming back to both lengthening scenarios of independent possessive forms
described above: The crucial point here is the fact that the change process from
a focus or predicative construction to an independent possessive form should not
be seen as a self-propelling grammaticalization process, but as a result of
speakers’ unconscious choices to communicate efficiently by highlighting the
less predictable meaning, thus ultimately bringing about functionally adapted
linguistic structures. In other words: If speakers did not sense the
communicative need to mark independent possessives with more linguistic
material, they would not drag parts of a focus or predicative construction into
an emergent independent possessive person-form in the first place.

Therefore, speaking of SYNCHRONIC “lengthening” strategies in independent
possessive forms, as I have done in the previous sections, could be
misinterpreted. What generations of speakers really do while communicating is
recruiting ALREADY EXISTING structures (lexical or grammatical) to fit new
grammatical functions (parts of old focus constructions and old predicative
constructions are used to express new independent possessive forms). Linguists
subscribing to the source-oriented approach would probably completely agree with
this statement. But, as I laid out in the preceding paragraph, there is a second
part to this story, where mere persistence accounts fail to explain the data:
While recruiting existing structures for new grammatical functions, speakers
unconsciously comply with the efficiency principle. As a result of the
cumulative individual speech acts, we observe ever changing functionally adapted
structures, which overwhelmingly point in the SAME direction: rarer, less
predictable meanings tend to be coded with longer forms than, or equally long
forms as, the more predictable meaning, but never with shorter forms.

Moreover, the examples of Haitian Creole pa and Jamaican fi-mi make clear that a
functional-adaptive approach in terms of coding efficiency has no problem with
the fact that the function or motivation of the source construction, here a
focus or predicative construction, is different from the function at the
synchronic level, here the independent possessive meaning. However, what is
important is the fact that speakers always refunctionalize existing lexical or
grammatical material in a predictable way. In many cases, the newer grammatical
functions that are expressed with already grammaticalized material follow quite
narrow grammaticalization paths. In other, more extreme cases, speakers exapt
existing grammatical material to make it fit their communicative needs, i.e.
highlighting less predictable meanings. This is the case with the erstwhile
Middle English dative case form hern that was exapted into the independent
possessive form (see §3). The mere existence of such exaptations in grammatical
change supports the idea that the source constructions can be irrelevant for the
synchronic grammatical patterns. But what is indeed effective in every utterance
and gives rise to universal coding asymmetries is the overarching functional
efficiency principle in signal coding: Spend as little energy as necessary to
reach the intended goals, from which it follows that less frequent and therefore
less predictable meanings come to be coded with more material than more frequent
and therefore more predictable meanings.

Thus, creole languages help sharpen our understanding of functional-adaptive 
forces unfolding in situations of unusually accelerated language change. 
Figure 1: Distribution of the 59 creole languages in APiCS (for more
information see apics-online.info) (CC BY-SA 4.0, Hans-Jörg Bibiko, MPI-
Chapter 9


Linguistic Frankenstein, or How to test 
universal constraints without real 
languages 


Natalia Levshina 
Leipzig University 


The scarcity of diachronic data represents a serious problem when linguists try 
to explain a typological universal. To overcome this empirical bottleneck, one can 
simulate the process of language evolution in artificial language learning exper- 
iments. After a brief discussion of the main principles and findings of such ex- 
periments, this paper presents a case study of causative constructions showing 
that language users have a bias towards the efficient organisation of communica- 
tion. They regularise their linguistic input such that more frequent causative situ- 
ations are expressed by shorter forms, and less frequent situations are expressed 
by longer forms. This supports the economy-based explanation of the universal 
form-meaning mapping found in causative constructions of different languages. 


1 Problems with testing functional explanations 


Functional linguists have formulated many universal principles that are meant 
to explain the structure and use of human languages, such as the principles of 
economy, iconicity, cognitive complexity, minimization of domains, avoidance 
of identity, and so on (e.g. Haiman 1983; Rohdenburg 1996; 2003; Hawkins 2004; 
Haspelmath 2008). How can one decide which explanation is relevant for a cer- 
tain cross-linguistic pattern and how does one make sure that the latter is not 
a result of a historical coincidence in the sense of Collins (2019 [this volume])? 
Ideally, we would need data from genealogically and geographically diverse lan- 
guages over a large time span. Needless to say, this is unrealistic: as a rule, such 
data are not available. The time depth and typological breadth of available 
diachronic data are very limited. Moreover, even in an ideal world where any kind 
of linguistic data is obtainable at the click of a button, this might still be insuffi- 
cient. First, real language data are observational, which makes a causal interpre- 
tation of the correlational results rather difficult (this does not mean there are 
no successful attempts, e.g. Moscoso del Prado 2014). Second, real language is 
a battleground of various forces, many of which can be mutually exclusive, e.g. 
over- and underspecification, iconicity and economy-driven arbitrariness, and so 
on. Disentangling these factors in real ‘messy’ language data is not a trivial task. 
Moreover, as pointed out by Smith et al. (2017) in their discussion of the univer- 
sal bias against free variation, transmission of language in populations can mask 
the biases of learners: the language in a population might retain variability even 
though every learner is biased against acquiring such variation. Unless the data 
contain meta-information about the speakers, these effects may go undetected. 

These problems can be solved with the help of the artificial language learn- 
ing paradigm, which has gained popularity recently. One can observe in real 
time how linguistic systems undergo change, revealing the cognitive and com- 
municative biases of language users. One can control for some factors while test- 
ing those of interest, and study the behaviour of each individual speaker within 
a population. Similar to the protagonist of Mary Shelley's gothic novel, Victor 
Frankenstein, who created a sentient living creature in his laboratory, a linguist 
can design a new language and watch it develop. 

Moreover, there have been quite a few experiments that put typological 
universals to the test, such as Greenberg's Universal 18 about harmonic word order 
within the NP (Culbertson et al. 2012), the suffixing preference (St. Clair et al. 2009), 
the definiteness hierarchy (Culbertson & Legendre 2011) or the bias towards 
consistency in head-dependent order (Christiansen 2000). In the present paper, 
however, I will focus on the experiments that demonstrate more abstract functional 
and learning biases, which, in turn, can be used to explain language universals 
and language-specific phenomena. An overview of the main principles and 
discoveries of artificial language learning with human subjects is provided in §2. To 
illustrate the approach, I will also present the results of a recent study, which 
tests the principle of economy on artificial causative constructions (see §3). A 
brief summary and outlook are provided in §4. 
2 The artificial language learning paradigm 


2.1 Main types of artificial language learning experiments 


There are several popular types of artificial language learning experiments (see 
Figure 1). First of all, learning can be iterated and non-iterated. In non-iterated 
learning, one can only study the individual process of acquisition. There is no 
further language transmission. In iterated learning, a subject learns a certain 
linguistic behaviour by observing the behaviour of one or more subjects who 
learnt it the same way, i.e. in the process of implicit induction and production 
(Kirby et al. 2014). The output of one generation of speakers serves as the input for 
the next one, similar to the transmission of real language and culture in general. 


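[Tree diagram: artificial language learning (ALL) experiments divide into 
non-iterative and iterative; iterative experiments into interactive and 
non-interactive; interactive experiments into dyads and microsocieties.] 

Figure 1: Main types of artificial language learning experiments. 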


Some communicative and learning biases may be strong enough to be detected 
in non-iterated learning. Sometimes even one generation is enough to radically 
change the language (Hudson Kam & Newport 2009). Weaker biases may require 
several generations in order to manifest themselves (e.g. Reali & Griffiths 2009; 
Smith & Wonnacott 2010). 
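
The contrast between strong and weak biases can be illustrated with a toy transmission chain. The sketch below is not taken from any of the cited studies. It assumes a hypothetical learner that raises the input odds of the majority variant to a power `bias` (1.0 means pure probability matching, larger values mean stronger regularization), and it passes each generation's output on as the next generation's input. All function names and parameter values are illustrative assumptions.

```python
def learner_output(p_in, bias):
    """Proportion of the majority variant that a learner produces.

    Hypothetical rule (an assumption of this sketch): output odds = input odds ** bias.
    bias == 1.0 reproduces the input (probability matching);
    bias > 1.0 exaggerates the majority variant (regularization).
    """
    a = p_in ** bias
    b = (1.0 - p_in) ** bias
    return a / (a + b)


def run_chain(p_start, bias, generations):
    """Iterated learning: each generation learns from the previous output."""
    proportions = [p_start]
    for _ in range(generations):
        proportions.append(learner_output(proportions[-1], bias))
    return proportions


def generations_until_regular(p_start, bias, threshold=0.95, limit=1000):
    """Number of generations until the majority variant reaches `threshold`."""
    p = p_start
    for generation in range(1, limit + 1):
        p = learner_output(p, bias)
        if p >= threshold:
            return generation
    return None  # the chain did not become regular within `limit` generations


if __name__ == "__main__":
    # Input language: the majority variant is used 60% of the time.
    print(generations_until_regular(0.6, bias=8.0))  # strong bias
    print(generations_until_regular(0.6, bias=1.2))  # weak bias
    print(generations_until_regular(0.6, bias=1.0))  # probability matching
```

Under these assumptions, the strong bias regularizes the language within a single generation, the weak bias needs eleven generations to reach the same threshold, and pure probability matching never does. This mirrors the point that weak biases may only become visible in iterated designs.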

Iterated learning can be further subdivided into interactive and non-interac- 
tive (cf. Tamariz 2017). In non-interactive designs, one creates transmission chains 
where one subject’s output is another subject’s input. There is no actual interac- 
tion between the subjects. No common ground is created, and no feedback is 
given. Interactive experiments involve dyads of interacting users or even mi- 
crosocieties, where everyone interacts with everyone else (Tamariz 2017). Lan- 
guage is transferred from one dyad to the following one, or from old members 
of a microsociety to the new ones. By using this approach, one can preserve 
common ground and feedback, which are crucial in everyday communication 
(Caldwell & Smith 2012). 
The artificial language learning paradigm is very flexible, allowing for the 
investigation of diverse forms: non-existent words, whistles, graphical scribbles. A 
language can also be fully artificial or semi-artificial. For instance, Smith & Won- 
nacott (2010) use some lexical items (nouns) from English, but novel verbs and 
plural noun markers. Usually, it is assumed that the results based on various me- 
dia are comparable, although some recent studies suggest that the role of univer- 
sal constraints (e.g. iconicity and compositionality) varies across different media 
(e.g. manual signs vs. sounds in Little et al. 2017). 

Crucially, the studies based on artificial language learning share one funda- 
mental assumption. Namely, those linguistic features that are easier to learn and 
use in communication will spread at the expense of less “fit” alternatives (Smith 
et al. 2017). By adjusting the linguistic input in a similar way, language users 
reveal their communicative and learning biases, which are so strikingly similar 
that one can speak about universal preferences. 


2.2 Evidence of universal constraints from artificial language learning 


The main results of recent studies can be concisely and non-exhaustively pre- 
sented in a list of the following universal biases: 


1. A bias towards arbitrariness (as opposed to iconicity), conventionalization 
and simplification of signs in interaction (e.g. Caldwell & Smith 2012). Sim- 
plified arbitrary signs are easier to select, produce and replicate than more 
complex iconic signs. At the same time, symbolic signs are more difficult to 
learn at first encounter, while iconicity seems to enhance the learnability 
of signs for new group members, as shown by Fay & Ellison (2013). They 
also found that the semiotic systems of larger populations reach a kind of 
compromise: they favour simple iconic signs, i.e. those that are minimally 
complex and maximally informative. 


2. A bias towards combinatorial structure, when meaningless elements 
(which serve as basic building blocks) are combined in higher-order units. 
This is also known as duality of patterning (Verhoef 2012). 


3. A bias towards compositional structure of syntax (Kirby et al. 2008). Dur- 
ing the process of iterative learning, language becomes more structured. 


4. A bias towards discrete structure as opposed to holistic signals. For exam- 
ple, in an iterated language learning experiment with a language based on 
whistles, participants come up with categorical distinctions, rather than 
paying attention to the precise acoustic realizations, e.g. in terms of pitch 
(Verhoef 2012). 


5. A bias towards regularity. Languages exhibiting free variation become 
increasingly regular, revealing a strong bias towards regularity in adult 
learners (Smith & Wonnacott 2010). This bias may be obscured by so-called 
probability matching: in a language in which two forms are in free variation, 
adult learners have also been found to produce each variant in accordance 
with its relative frequency in the input (Hudson Kam & Newport 
2009; Wonnacott & Newport 2005). The interplay between regularization 
and probability matching depends on the frequency distribution. The more 
forms with lower frequencies are used as free variants of the main form, 
the more scattered the pattern and the stronger the bias towards production 
of the main form (Hudson Kam & Newport 2009). 


6. A bias towards economy and communicative efficiency, when more 
predictable information gets less formal coding, and less predictable information 
gets more formal coding. This bias has been observed in a study of 
differential case marking (Fedzechkina et al. 2012). The hypothesis is that 
a referential expression should be more likely to receive overt case marking 
when its intended grammatical function is less expected. The experiment 
shows that learners deviate from the initial input to make the language 
more communicatively efficient. 


7. A bias towards underspecification of irrelevant conceptual dimensions. 
Silvey et al. (2015) have found that their artificial language, which was orig- 
inally fully specified in the sense that it had a unique label for each object, 
became underspecified by losing contrasts across irrelevant dimensions, 
i.e. those that are not important for discriminating between the stimuli. In 
contrast, Tinits et al. (2017) found a bias towards overspecification and re- 
dundancy in the contexts when the relevant dimensions were difficult to 
discern. 


A key question is whether these biases are due to higher learnability or com- 
municative advantages of the preferred features, or both. Using the terms from 
Haspelmath (2019 [this volume]), are we dealing with acquisitional or functional- 
adaptive constraints? 

It is clear from the existing evidence that more learnable systems are not 
necessarily the ones that are also more usable, and the other way round. As was shown 
above, arbitrary signs, which are more usable in interaction, are less learnable 
than more iconic ones. One can find a similar clash between learnability and 
usability with regard to regularization and underspecification. As found by Kirby et 
al. (2008), Verhoef (2012) and others, languages that are more regular and compo- 
sitional are easier to learn and are more successfully passed from one generation 
to another. Their studies demonstrate that the learning errors decrease with time 
(number of generations), as compositionality and regularity increase. At the same 
time, such emerging systems also exhibit greater ambiguity because the number 
of lexical items drops. As a result, the languages become increasingly underspec- 
ified, which would reduce their usability. 

Interestingly, it has been claimed that children tend to regularize, or 
systematize, more strongly than adults (Hudson Kam & Newport 2009). This finding has 
been attributed to children having fewer cognitive resources than adults, in 
particular memory limitations. However, Smith et al. (2017) do not find this argument 
very convincing because, as they claim, memory limitations do not always lead to 
regularization. Alternatively, one may suppose that adult learners may be better 
at conforming to social expectations and norms. In general, there are important 
differences in the emergent languages depending on the social circumstances 
of communication. For instance, Perfors (2016) observes that adults regularize 
more strongly when they believe that the variation is unpredictable (i.e. they are told 
that the previous person was under time pressure and might have made a few 
errors), than when they are asked to match an imaginary output of another per- 
son, who is believed to be performing the same naming task at the same time. 
When the participants believe that the variation is predictable (even if they do 
not know what it actually depends on), and their goal is to learn the language as 
closely as possible, they do probability matching more and regularize less. There 
is also evidence that speakers produce more regular language when they believe 
they are addressing a person, even though they are in fact communicating with a 
computer (Fehér et al. 2016). Apparently, speakers believe that producing a more 
regular language will facilitate communication with their human partner. Simi- 
larly, Little (2011) discovered that morphosyntactic complexity decreases when 
expert participants, who have been trained in an artificial language, interact with 
naive ones who have little knowledge of the same language. This effect, how- 
ever, was not observed when the experts interacted with other experts. Thus, the 
emergent language system depends on the social circumstances and pragmatic 
goals of the subjects. This relationship, which seems to be present already at the 
learning stage, makes a neat separation of acquisitional and functional-adaptive 
constraints a very challenging task. 
In the remaining part of the paper, I will focus on the bias towards 
communicatively efficient, economical form-meaning mapping, using a non-iterative online 
experiment. 


3 Case study: Frequency effects in causative constructions 


3.1 Hypothesis: Economy and formal length 


As was shown above, there is ample evidence that language learners are generally 
sensitive to frequency information. In this case study, I focus on the claim that 
more frequent situations are expressed by means of less coding material than less 
frequent ones. Such differences are predicted by the principle of economy and, 
more broadly, the principle of communicative efficiency. According to these 
principles, more predictable information needs less coding material than less 
predictable information. The experiment in Fedzechkina et al. (2012), which was 
mentioned above, demonstrated the effect of predictability based on semantic 
categories. In my own study, I want to focus on predictability based on frequency 
information. To the best of my knowledge, these effects have not been tested 
previously in artificial language learning experiments. 

Causatives serve as convenient and well-studied material for testing the bias in 
question. There is a cross-linguistic correlation between form and meaning: more 
formally integrated causatives, such as lexical causatives kill or break, tend to 
denote more integrated causing and caused events than less integrated forms, 
such as cause to die or make break. As Comrie (1981: 165) puts it, “the kind 
of formal distinction found across languages is identical: the continuum from 
analytic via morphological to lexical causative correlates with the continuum 
from less direct to more direct causation". Consider the example in (1): 


(1) English (personal knowledge) 
a. John killed Bill in his mansion on Tuesday... 
i. ...?? by shooting him in the forest on Monday. 
ii. ... ?? by tampering with his gun. 
b. John caused Bill to die in his mansion on Tuesday... 
i. ... by shooting him in the forest on Monday. 
ii. ... by tampering with his gun. 


In this example, the lexical causative kill expresses direct causation with high 
spatiotemporal integration of the causing and caused events (John’s killing and Bill’s 
dying, respectively) and with direct impact of the Causer (John) on the Causee 
(Bill), whereas the analytic causative (cause to die) expresses indirect causation 
without spatiotemporal integration of the events and without direct impact of the 
Causer. This correlation between conceptual and formal integration of events has 
also been found in a large typologically diverse sample of languages (Levshina 
2018a). 

Haspelmath (2008) suggests an alternative account of this correlation based on 
the principle of economy: more frequent forms are usually shorter, whereas less 
frequent ones tend to be longer. As my current corpus-based work shows (Lev- 
shina 2018b: Ch. 3), the frequencies of direct causation and related properties (e.g. 
lack of autonomy on the part of the Causee, implicative causation, factitive cau- 
sation, etc.) are substantially higher than those of indirect causation. Thus, these 
parameters (conceptual integration, formal compactness and relative frequency) 
are intercorrelated: more compact causatives represent both more frequent sit- 
uations and more integrated events, whereas less compact causatives represent 
less frequent situations and also less integrated events. This creates a situation 
in which it is very difficult to decide based on observational data alone which 
of the functional principles actually explains the cross-linguistic correlation be- 
tween formal and conceptual integration, i.e. iconicity or economy. The purpose 
of the present study is to test whether the economy effect is still observed when 
the iconic correspondence is not present. 


3.2 Design and procedure 


The participants of the experiment were asked to learn an alien language. At the 
beginning, they read an introduction: 


In this experiment you will learn the lingua franca of a highly developed 
civilization that exists on a planet in a galaxy far, far away... The planet is 
called Atruur. Its only vegetation form is called ‘grok’. It is similar to a cactus 
and is used by the Atruurians for food, as fuel for their flying vehicles and 
for entertainment. Because the Atruurians traditionally detest any form of 
physical activity, they have developed a technology for teleportation and 
telekinesis. 


The introduction also mentioned that the word order is SV (for intransitives) or 
SOV (for transitives). To explain that to non-linguists, examples were provided, 
which are shown below for illustration: 
(2) Grok babum. 
    cactus grow 
    ‘A grok (cactus) grows.’ 

(3) Sia grok hum. 
    Atruurian cactus see 
    ‘An Atruurian sees a grok (cactus).’ 


The subjects were first asked to learn the language by copying the sentences 
in Atruurian that describe situations shown in video clips. At first, they saw four 
situations: a cactus-like plant appears, disappears, grows and shrinks in size. The 
goal of that task was to introduce the basic vocabulary. 

Next, the participants saw 32 causal situations, which represented a causal 
version of the same situations. In each of these causal situations, there was a 
flying saucer (sometimes with an alien inside) which hovered above the plant 
and flashed a yellow or blue light three times in a row. As a result, the plant 
either appeared, disappeared, grew or shrank. Varying types of saucers were 
shown. 

Crucially, the subjects saw two types of causing events. In the first, the 
saucer flashed a yellow light from above the plant. In the other, the saucer 
flashed a blue light from the left of the plant. The yellow-light 
causing event was three times as frequent as the blue-light causing event (i.e. 
75% vs. 25%). The distribution of the four caused events was the same for each of 
the causing events. There were no reasons to assume that one type of causation 
is more or less direct than the other. The colour and the position of the Causer 
with regard to the Causee are not mentioned in the semantic parameters that 
are distinctive of different causative constructions in the languages of the world 
(Levshina 2018b: Ch. 3). 

As for the artificial language, the most important thing is that each causing 
event is represented by two allomorphs. One of the causing events was associated 
with the forms tere- or te-, as in (4), and the other one was described by using the 
forms gara- or ga-. 


(4) a. Sia grok te-babum. 
       Atruurian plant CAUS-grow 
       ‘The Atruurian caused the plant to grow (by flashing with yellow 
       light from above).’ 

    b. Sia grok tere-babum. 
       Atruurian plant CAUS-grow 
       ‘The Atruurian caused the plant to grow (by flashing with yellow 
       light from above).’ 


Note that the form-meaning mapping varied across the subjects. That is, for 
some of them, te-/tere- denoted the causing event with yellow light flashed from 
above, whereas the ga-/gara- forms were used for the causing event with blue 
light flashed from the left of the plant. For the others, this was the other way 
round. The prefixes were evenly distributed among the stimuli, so that there was 
truly free variation. There was no condition in the experimental design that could 
explain the preference for the longer or the shorter form. 
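
The training input described above can be written down as a small data set. The following sketch is a hypothetical reconstruction, not the actual stimulus list. It assumes that the 32 causal clips were split 24:8 between the two causing events, that they were spread uniformly over the four caused events, and that short and long prefixes were balanced within every cell. The text only states that the distribution of caused events was the same for both causing events and that the prefixes were evenly distributed.

```python
from collections import Counter

CAUSED_EVENTS = ["appear", "disappear", "grow", "shrink"]

# Assumption: the 32 causal clips split 24:8 (75% vs. 25%) between the two
# causing events and are spread uniformly over the four caused events.
CLIPS_PER_CELL = {"frequent": 6, "rare": 2}


def build_training_set():
    """Return one record per training clip (hypothetical reconstruction)."""
    stimuli = []
    for causing_event, clips in CLIPS_PER_CELL.items():
        for caused_event in CAUSED_EVENTS:
            for i in range(clips):
                # Alternate prefix length so that short and long forms are
                # evenly distributed within every cell (free variation).
                prefix = "short" if i % 2 == 0 else "long"
                stimuli.append(
                    {"causing": causing_event, "caused": caused_event, "prefix": prefix}
                )
    return stimuli


training = build_training_set()
per_causing_event = Counter(s["causing"] for s in training)
per_prefix = Counter((s["causing"], s["prefix"]) for s in training)

print(len(training))            # total number of causal clips
print(dict(per_causing_event))  # clips per causing event
print(dict(per_prefix))         # prefix lengths within each causing event
```

In such an input, prefix length carries no information about the causing event, so any association between length and frequency in the participants' output has to come from the participants themselves.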

One should mention here that free variation is less exotic than it seems. It oc- 
curs in the language of late learners of a second language, e.g. hearing parents 
of a deaf child who learn to sign, or during the emergence of a new language, 
e.g. Tok Pisin and other pidgins and creoles (see an overview in Hudson Kam & 
Newport 2009). This is why the input language is not completely outlandish from 
a functional point of view. However, since language users have a bias towards 
regularization and against free variation, I expected that the subjects would 
regularize the free variation in the input, preferring the short allomorph to convey 
the frequent causing event, and using the long allomorph to express the rare 
causing event. 
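
The economy-based expectation can be made concrete with a small worked calculation. The sketch below compares the expected prefix length per utterance under three ways of using the allomorphs. Measuring length in letters (2 for te-/ga-, 4 for tere-/gara-) and the function name `expected_prefix_length` are assumptions made for illustration only.

```python
# Prefix lengths in letters: te-/ga- vs. tere-/gara- (unit of length is an assumption).
PREFIX_LENGTH = {"short": 2, "long": 4}
# Relative frequency of the two causing events in the training input.
EVENT_PROBABILITY = {"frequent": 0.75, "rare": 0.25}


def expected_prefix_length(p_short_given_event):
    """Expected prefix length per utterance.

    `p_short_given_event` maps each causing event to the probability
    that a speaker uses the short allomorph for it.
    """
    total = 0.0
    for event, p_event in EVENT_PROBABILITY.items():
        p_short = p_short_given_event[event]
        mean_length = (
            p_short * PREFIX_LENGTH["short"] + (1.0 - p_short) * PREFIX_LENGTH["long"]
        )
        total += p_event * mean_length
    return total


free_variation = expected_prefix_length({"frequent": 0.5, "rare": 0.5})
economical = expected_prefix_length({"frequent": 1.0, "rare": 0.0})
anti_economical = expected_prefix_length({"frequent": 0.0, "rare": 1.0})

print(free_variation, economical, anti_economical)
```

With these numbers, free variation costs 3.0 letters per utterance on average, the economical mapping 2.5 and the reverse mapping 3.5. Abandoning the long forms altogether would be shorter still (2.0), so the comparison only concerns systems that keep both allomorphs.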

After the training session, the subjects were asked to describe in Atruurian 
what is going on in video clips. The stimuli represented a selection from the 
previous stimuli: each of the caused events was presented with causing event A 
and causing event B. In total, there were eight test situations. 

The experiment was performed online, using Google Forms with built-in You- 
Tube videos. The latter were created with the help of Adobe Animate CC software 
by myself. Figure 2 demonstrates four fragments from one of the video clips, with 
the causing event A and the caused event of disappearance. 


3.3 Participants 


The participants were recruited via my personal network and LinguistList. Most 
of them had a background in linguistics or languages. After the experiment, they 
were asked about the aims of the experiment. None of them guessed the true pur- 
pose. Overall, I obtained responses from 84 participants. Some of the responses 
were removed. This was the case if the participants did not follow the training 
procedure instructions (e.g. a participant did not type in the training sentences), 
or if the output was unanalysable. As a result, I had 554 valid data points from 
70 participants. 

Figure 2: Fragments from one of the video clips. 

The participants with valid responses had different L1s, but mostly had a Slavic 
and Germanic linguistic background. There were 40 native Czech speakers, 12 
native German speakers, 7 native English speakers, 2 Dutch speakers, 2 Italian 
speakers, as well as native speakers of Brazilian Portuguese, Croatian, Danish, 
Polish, Russian, Slovak and Turkish. None of these languages has productive 
causative prefixes. 


3.4 Results of the artificial language learning experiment 


The counts aggregated across all participants are presented in Table 1. Lexical and 
spelling errors were ignored. Figure 3, which visualizes these counts, shows that 
there is a difference between the proportions of short and long forms expressing 
the frequent and rare causing events. The short forms are overall more preferred 
than the long ones, but the situations with the more frequent causing event are 
more frequently expressed by the short forms in comparison with the situations 
that involve the rare causing event, where the proportions of the short and long 
forms are almost equal. 


Table 1: The number of forms selected and their marginal sums. 

         Frequent   Rare   Total 
Short       168      137     305 
Long        109      140     249 
Total       277      277     554 
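
The proportions and the raw odds ratio behind Table 1 can be recomputed directly from the four cell counts. The sketch below uses only those counts; the names `COUNTS`, `proportion_short` and `odds_long` are illustrative. Note that the raw odds ratio ignores the grouping of responses by participant, so it is not expected to coincide with the estimate from the mixed-effects model reported in this section.

```python
# Cell counts from Table 1: COUNTS[form][causing_event]
COUNTS = {
    "short": {"frequent": 168, "rare": 137},
    "long": {"frequent": 109, "rare": 140},
}


def column_total(event):
    """Number of responses for one type of causing event."""
    return sum(row[event] for row in COUNTS.values())


def proportion_short(event):
    """Share of short forms among the responses for one causing event."""
    return COUNTS["short"][event] / column_total(event)


def odds_long(event):
    """Odds of a long form (long : short) for one causing event."""
    return COUNTS["long"][event] / COUNTS["short"][event]


# Raw (unadjusted) odds ratio: rare vs. frequent causing event.
raw_odds_ratio = odds_long("rare") / odds_long("frequent")

print(column_total("frequent"), column_total("rare"))
print(round(proportion_short("frequent"), 3), round(proportion_short("rare"), 3))
print(round(raw_odds_ratio, 3))
```

The short forms account for about 61% of the responses for the frequent causing event and about 49% for the rare one, and the unadjusted odds of a long form are roughly 1.58 times as high for the rare event.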


[Bar chart: proportions of long and short forms (legend: Form) for each causing 
event (x-axis: CausingEvent, Frequent vs. Rare; y-axis: Proportion).] 

Figure 3: Proportions of short and long forms. 


A closer look at the individual subjects' preferences reveals that most of them 
use both long and short forms. Seven subjects produced only the short forms. 
There were no subjects who always preferred the long forms. The distribution is 
shown in Figure 4. 

Figure 4: Individual preferences for the long and short forms (axis label: 
Number of long forms). 

The main question, however, is whether the choice of forms is influenced by 
the type of causing event. In order to test this, I fit a generalized linear 
mixed-effects model with logit as the link function (R package lme4, function glmer, 
Bates et al. 2015). The type of prefix, long or short, was the response variable. 
The individual participants were treated as random effects (intercepts). There is a 
significant effect of the type of causing event: if the event is rare, the odds of the 
longer form being chosen are 1.66 times as high as when the event is frequent 
(log-odds ratio b = 0.501, p = 0.006). This result supports the hypothesis that 
speakers have a bias towards the use of shorter forms to represent more frequent 
situations, and longer forms to represent less frequent situations. Random slopes, 
which represented individual differences in the effect of the predictor on the 
response, were tested as well, but they did not improve the explanatory power 
of the model. 
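
For readers less familiar with this type of model, the random-intercept specification described above can be written out as follows. The notation is an assumption introduced here for illustration: y_ij is the i-th response of participant j, Rare_ij equals 1 for the rare causing event and 0 for the frequent one, and u_j is the participant-specific intercept.

```latex
\begin{align*}
\operatorname{logit} \Pr(y_{ij} = \text{long})
  &= \beta_0 + \beta_1\,\mathrm{Rare}_{ij} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2) \\
\mathrm{OR} &= e^{\beta_1}, \qquad e^{0.501} \approx 1.65
\end{align*}
```

Exponentiating the reported coefficient gives approximately 1.65, close to the reported odds ratio of 1.66; the small difference presumably reflects rounding in the reported figures.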

The likelihood ratio test, a standard tool for variable selection and model com- 
parison in regression analysis, demonstrates that the caused event does not have 
a significant effect on the choice of form (p = 0.84), and does not interact with 
the type of causing event (p = 0.6). This means that lexical conditioning can be 
excluded (cf. Smith & Wonnacott 2010). 
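
The logic of the likelihood ratio test can be summarized in one formula. The notation is an assumption made here: the two log-likelihoods are the maximized values for the model with and without the term under scrutiny. With four caused events, adding the caused event as a fixed effect would add three parameters under the usual dummy coding, and its interaction with the causing event three more; these degrees of freedom are inferred from the design, not reported in the text.

```latex
\[
\Lambda = 2\,\bigl(\ell_{\text{full}} - \ell_{\text{reduced}}\bigr)
\;\overset{\text{approx.}}{\sim}\; \chi^2_{k},
\qquad k = \text{number of parameters added in the full model}
\]
```

A large p-value, as reported here, means that the additional parameters do not improve the fit beyond what would be expected by chance.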
4 Discussion 


This paper has provided an overview of the applications of the artificial language 
learning paradigm in testing universal biases suggested by functional and cogni- 
tive linguists. One of them, known as the principle of economy, was tested in an 
online experiment. The results demonstrate that frequent causative situations are 
more commonly expressed by shorter forms, whereas the subjects are more tol- 
erant of longer forms when expressing rarer causative situations. Therefore, the 
results of previous corpus-based studies, typological evidence and experimental 
approaches converge. The fact that the effect was detected in a non-iterative 
experiment with only one "generation" of language learners suggests that the bias 
is very strong. 

An important question remains about the nature of this bias and its place 
in Haspelmath's (2019 [this volume]) classification. Can it be characterized as 
a functional-adaptive, acquisitional, mutational or maybe even representational 
constraint? The mutational type can be discarded because we do not have any 
qualitative changes in the constructions (e.g. possessed nouns becoming adpo- 
sitions). As for representational constraints, they reflect the properties of the 
innate language faculty. Even if we accept that economy is an innate principle, 
since humans and other species are genetically programmed to gain maximal 
benefits from their behaviour at minimum costs (cf. Parker & Smith 1990), it repre- 
sents a domain-general bias that is not restricted to human language only. There 
is evidence that the evolution of sense organs and brains is driven by the need 
to minimize the energy spent for each bit of information received from the envi- 
ronment (Stone 2015: 3). This is why linguistic economy is not a part of Universal 
Grammar in the generativist sense. 

Thus, we are left with the functional-adaptive and acquisitional types. It is true 
that Haspelmath (2019 [this volume]) defines the latter as related to L1 acquisition by 
children only, but the overview presented in §2 demonstrates that learnability constraints 
can also be detected in artificial language learning by children and adults. §2 also 
showed that a clear distinction between communicative efficiency and 
learnability is often problematic within the artificial language learning paradigm. Similar 
to Slobin's (1996) famous "thinking for speaking", we can also speak about "learn- 
ing for using". This makes the task of distinguishing between these types very 
difficult. My preliminary answer is that we are dealing with a functional-adaptive 
constraint because it helps to optimize communication, even though there is no 
immediate interaction in the experiment. Obviously, more research with a clearer 
separation between the learning and communication stages is needed. The most 
pertinent question at this stage is the following: Is it easier to learn a more 
communicatively efficient language, in which frequent meanings are expressed by 
shorter forms, and rare functions are expressed by longer forms, than a less ef- 
ficient one, in which frequent meanings are expressed by longer forms and rare 
ones by shorter forms? 

The artificial language learning paradigm, as I tried to demonstrate in this pa- 
per, represents a valuable addition to the toolkit of typologists and functional 
linguists. However, there are a few caveats that need to be mentioned. First, 
the experiments involve very limited interaction, if any, in an artificial context. 
Second, the populations are extremely small. Third, even when a fully artificial 
language is used, one cannot exclude transfer effects from real language. For 
example, Goldberg (2013), in her critical evaluation of Culbertson et al.'s (2012) 
study of Greenberg's Universal 18, argues that the word orders Adj + N and N 
+ Adj can be transferred either from English (e.g. a blue bird vs. something red, 
all things nice) or from the Spanish-type languages. Note, however, that Cul- 
bertson et al. studied a specific formal pattern. As we move to more abstract 
properties of language or communicative behaviour, such as compositionality or 
economy, it becomes more difficult to explain these properties by transfer from 
real languages, although one cannot exclude a possibility that these biases rep- 
resent very abstract generalizations from the users' experience with language, 
and their intuitive expectations about what a “normal” human language should 
be like. This uncertainty will probably always loom until we find a real alien and 
have it learn an artificial language. 
Chapter 10 


Diachronic sources, functional 
motivations and the nature of the 
evidence: A synthesis 


Karsten Schmidtke-Bode 
Leipzig University and Friedrich Schiller University Jena 


Eitan Grossman 


Hebrew University of Jerusalem 


In this epilogue, we summarize and reflect on the major threads and arguments 
from the individual contributions to this volume (§1), and we also briefly outline 
some challenges and directions for future work on the topic (§2). 


1 The voices of the present volume 


Above all, this volume has been concerned with the theoretical status of the his- 
torical processes that lead to statistical universals of grammatical structure. In 
the typological research programme initiated by Greenberg, it was recognized 
from early on that a “dynamic” perspective on universals is vital (e.g. Greenberg 
1969), and it was also occasionally suggested that the diachronic origins of the 
structures in question may themselves be able to explain current universal pat- 
terns (e.g. Givón 1975; Greenberg 1978). 

In the last decade or so, this position appears to have become more popular and 
to be pursued more systematically, and it is now often explicitly contrasted with 
the widespread view that universal grammatical patterns reflect adaptations to 
language users' needs, such as a preference for iconic or efficient grammatical 
structures. As argued in the lead article to the present volume, this latter position
can be characterized as “result-oriented”: Languages develop similar traits
by virtue of their users selecting and propagating more adaptive variants over 
less adaptive ones.¹ Accordingly, there is an important diachronic element in
this line of explanation, but the individual diachronic processes that bring the 
synchronic results about "merely serve to realize the adaptation" (Haspelmath, 
p. 9; see also Hawkins (2004: 266) for a similar formulation). This is the crucial 
difference from the "source-oriented" approach, in which synchronic typologi- 
cal patterns are explained directly in terms of their source constructions: They 
are seen as long-term reflections of the particular ways in which each structure 
originally emerged, and it is argued not only that these "birth processes" do not 
provide evidence for an overarching functional-adaptive force being at work (e.g. 
Cristofaro's paper), but that there is simply no need to appeal to such forces be- 
cause the diachronic development produces the synchronic pattern as a natural 
by-product that persists into the present time (e.g. Cristofaro's, Collins's and 
partly also Dryer's and Diessel's contributions). 

The general thrust of source-oriented typology is thus that it can provide an 
immediate, or "first-level" (Creissels 2008: 1), explanation for synchronic distri- 
butions, and Haspelmath proposes in his lead article that whenever immediate 
explanations in terms of source constructions are feasible, they should actually 
be preferred to functional-adaptive ones, following a logic similar to Occam's 
Razor: Immediate explanations are less "costly", in that no independent evidence 
for alleged functional principles, and their interaction with possibly overriding 
forces, has to be adduced. Cristofaro (2014), for example, suggests that a language 
without any case markers for core arguments may come to develop a specific 
A-marker by processes of grammaticalization (e.g. instrumental to ergative re- 
analysis); the direct result of this development is a case system that retains zero 
marking for S and P. Typologically, such a system thus patterns with others in 
which one core argument of transitive clauses, as well as the S argument of
intransitive clauses, remains unmarked. But since the diachronic facts give us this
constellation "for free", in Bybee's (2010: 111) words, it would be costlier to summon
additional functional-adaptive forces to explain the pattern: If one assumes that 
the above scenario is representative of how ergative case systems arise in gen- 
eral, there is no need to appeal to overarching discourse-pragmatic similarities 


¹For many readers, the term "result-oriented" will conjure up the notion of teleology. But as,
for example, Keller (1994) and Croft (2000: 64-71) show, there are many functional-adaptive 
changes that can be interpreted non-teleologically: They are made in pursuit of individual com- 
municative goals rather than with the intention of changing the distribution of grammatical 
marking in the language at large. 
between S and P (Du Bois 1987) or to a general drive for economical alignment
systems (Comrie 1989). The same logic applies to P-marking systems and, at least 
in some cases, even to split-intransitive systems (Creissels 2008). 

Source-based accounts, then, are simultaneously more and less elegant than 
many functional-adaptive motivations: They are composed of individual, partly 
heterogeneous strands of explanation (Cristofaro, p. 41), which makes them ar- 
guably less elegant than highly general principles like "harmonic alignment" 
(Aissen 2003) or "early immediate constituents" (Hawkins 1994), but also perhaps 
less susceptible to postulating motivations that, in fact, may not apply.³ At the
same time, their immediate and hence uncostly way of accounting for universals 
makes them more economical than functional-adaptive explanations. 

But there is a critical issue with accepting such low-cost explanations, which is 
epistemological in nature: While we have robust evidence for synchronic states –
and typological research has produced a number of sophisticated tools for isolating
truly universal tendencies in those states from contingencies like geographical
and genealogical relatedness (e.g. Bickel 2013; forthcoming; Jaeger et al. 2011) –
we do not have comparable world-wide evidence for diachronic trajectories and
diachronic sources. It is thus inevitable that the data provided by source-oriented 
typologists are highly biased (to certain well-documented or convincingly recon- 
structed families), sketchy and, as Creissels (2008: 3) readily admits, often "largely 
speculative". As long as we have no way of knowing whether the historical sce- 
narios postulated for a particular domain are typical of that domain or, indeed, 
exhaust the possibilities by which a given synchronic pattern can arise in that 
domain, we cannot be sure that the sources provide the best explanation for the 
synchronic patterns.⁴


²Note that Creissels uses the terms "direct" and "indirect" explanation for what is here called
result-oriented and source-oriented, respectively. For Creissels, source-oriented accounts are 
less direct than the functional motivations commonly invoked in typology, but according to 
Haspelmath's “cost scale" of explanation, it makes sense to say that source-based accounts 
are actually more direct than result-oriented approaches, as the former lead us directly from 
the source construction to the present distributions. This is why we called them “immediate” 
explanations above. 

³An example from the present volume is Diessel's critique of (specific aspects of) Hawkins's
processing account of the position of subordinating morphemes. Diessel argues that there is a 
plausible source-based explanation for these patterns that simply does not need to appeal to 
overarching processing principles. 

⁴In Dryer's contribution, this point is made with regard to nominalizations being the (alleged)
major origin of the V-O & N-Gen correlation: He argues (against Collins) that the diachronic
evidence for this scenario is currently too scanty and speculative to be accepted as a valid
explanation, and that there are "many [other] ways" (Dryer, p. 83) in which this correlation
can come about diachronically. 
This epistemological problem is the basis for Haspelmath’s distinction between
“recurrent patterns of change” and the much stronger “mutational constraints”;
crucially, he claims that it is only the latter that can constitute a genuine explana- 
tion for language universals. The word-order correlations described in Collins's 
(and in the main part of Dryer's) paper may well be due to such mutational con- 
straints: Even in the absence of more balanced historical evidence, the pieces 
of the diachronic puzzle that we do have suggest that there are very strong re- 
strictions on the possible sources of auxiliaries and adpositions. The fact that 
the position of these categories correlates with that of their sources may thus 
be sufficiently explained by diachronic persistence effects.⁵ But in many other
cases, especially those involving coding contrasts rather than pairs of linear or- 
der, things are not as clear-cut: The pathways we know of are usually more di- 
verse, so the many cases for which we do not have historical data may just as 
well result from yet other sources and trajectories. And yet the synchronic states 
they yield are strikingly more uniform than expected by chance, and it is pre- 
cisely for such situations that Haspelmath finds the costlier functional-adaptive 
motivations appropriate. 

While the distinction between "recurrent patterns of change" and “mutational 
constraints" may be very difficult to maintain in practice, it highlights that alter- 
native terms like “diachronic approach” or “diachronic explanation" are actually 
misleading: Both source- and result-oriented approaches in typology rely fun- 
damentally on diachronic processes, as even result-oriented motivations must 
somehow be implemented in the developments that produce the synchronic pat- 
tern (Haspelmath 1999). Haspelmath's terminological proposal thus helps to clar- 
ify the essence of the contrast, which lies precisely in the theoretical status at- 
tributed to diachronic processes in the two approaches. 

At the same time, the opposition of “mutational” and “functional-adaptive” 
constraints makes one wonder whether the former are not also ultimately driven 
by functional motivations. We will return to this question in more detail in §2
again, but let us still ask at this point how supporters of source-oriented explana- 
tions motivate the kinds of diachronic developments they discuss: They clearly ar- 
gue against functional-adaptive, i.e. result-oriented, motivations in Haspelmath's 
sense, but they neither claim that diachronic processes are entirely accidental,⁶


⁵Though see, e.g., Harris & Campbell (1995: 210-215) for important qualifications of this view:
It is not always the case that the ordering pattern from the source construction is actually 
retained in the target construction and, conversely, correlating orders may come about by 
processes other than reanalysis. We will return to this latter point below. 

⁶This even holds for Collins's contribution, who claims that “some universals are historical
accidents" (p. 47): He uses the term "accidental" as an antonym of “functional-adaptive”, but 
not as a synonym of "random" or "chaotic". 
nor that they are triggered by innate "representational constraints" (in Haspelmath's
terms).⁷ It seems to us that source-oriented explanations chiefly make reference
to online processes of categorization and inferencing (Bybee 2010), which
can lead to the reanalysis of the form-function mappings in a given surface string 
(Croft 2000; De Smet 2009). It is in this way that new grammatical markers (e.g. 
case or number morphemes) as well as new syntactic structures (e.g. Aux-V con- 
structions) "naturally" emerge from pre-existing material, in all languages. To 
be sure, these diachronic processes work by applying domain-general cognitive 
mechanisms to communicative interactions, so they are ultimately motivated in 
terms of these mechanisms. Crucially, however, the reanalysis itself is not adap- 
tive in nature: It does not happen in order to produce a more efficient coding 
strategy for, say, number or case, or to disambiguate the core arguments of tran- 
sitive clauses. This is one of Cristofaro's major arguments for rejecting functional- 
adaptive motivations for case and number marking (and similar grammatical cat- 
egories). If anything, then, the major mechanism of grammatical innovation in 
source-based accounts, i.e. reanalysis, is itself motivated by whatever perceptual, 
cognitive or communicative factors invite a reanalysis in the first place.⁸

But a critical question is whether reanalysis is really the only way in which 
existing morphemes and constructions can give rise to grammatical innovations. 
Result-oriented approaches allow for the possibility that the agents of such in- 
novations are not exclusively listeners (as in reanalysis), but also speakers. In 
particular, speakers may re-functionalize existing material in precisely those con- 
texts where additional marking is felt to be beneficial for information processing. 
Some of the diachronic scenarios proposed in Michaelis' paper appear to rely on 
this mechanism: She argues that when possessive person forms (e.g. your) are 
employed to express a relatively unusual function, namely reference (yours) in- 
stead of attribution, speakers summon additional marking to signal this devia- 
tion (see also Croft 1991; 2001 for a similar proposal). Zeevat & Jäger (2002) refer 
to this functional-adaptive recruitment of marking as “annexation” or “seman- 
tic epenthesis", and they use this mechanism to explain, for example, the rise of 


⁷Because of the latter, the debate in the present book – in contrast to earlier volumes on the topic
(e.g. Hawkins 1988; Good 2008) – takes place entirely within the non-generative, i.e. usage-based,
camp.

⁸There are a number of interesting proposals as to the kinds of context that facilitate or even
induce reanalysis (e.g. Detges & Waltereit 2002; Hansen & Waltereit 2006; Rosemeyer & Gross- 
man 2017; Schwenter & Waltereit 2009; Traugott & Dasher 2002). To take but one recent 
example, Eckardt (2009) argues that listeners typically reanalyze utterances in situations of 
“pragmatic overload", e.g. utterances whose presuppositions are not easily accommodated. In 
addition, it has also been suggested that reanalysis is sanctioned by the cognitive mechanism 
of analogy, in that reanalysis is often contingent on a model, or supporting construction, in 
the language in question (see Fischer 2011; De Smet & Fischer 2017). 
differential argument marking (e.g. the flagging of objects with statistically unexpected
referential properties). Whether or not one finds these particular examples
convincing, one could – in principle – imagine that some of the diachronic
sources of plural marking that Cristofaro discusses are not due to reanalysis ei- 
ther, but go back further, namely to a speaker's insertion of a lexical marker like 
‘all’, ‘people’, etc. in contexts where plurality is relatively unexpected or in need 
of disambiguation (as in Present-Day English you all, you guys, etc.); it is only 
afterwards that such markers get reanalyzed and grammaticalized as plural mor- 
phemes, but their ultimate origin may have been a pattern of functional-adaptive 
“annexation”.⁹ Therefore, although we will never be able to replay the innovation
of highly grammaticalized markers, we should not exclude a priori the possibility 
that it is driven by processes other than reanalysis. 

An immediately related issue is that source-based typologies also tend to be 
too narrow on another plane: When Cristofaro pleads to "take diachronic ev- 
idence seriously" (p. 25), one wonders why her arguments against particular 
functional motivations revolve entirely around the innovation stage of gram- 
matical change, to the exclusion of further developments. The contributions by 
Schmidtke-Bode and by Seržant highlight the importance of diffusion processes,
i.e. the gradual extension of innovations to new environments, and particularly
also stages at which the use of a grammatical marker is (still) variable. There is 
ample evidence from corpus data, grammatical descriptions, psycholinguistic ex- 
perimentation and, as Levshina's paper shows, from artificial language learning, 
that a significant part of such variability is driven by functional-adaptive motiva- 
tions. Therefore, we believe that a more appropriate way of taking diachronic ev- 
idence seriously would be to return to Bybee's (1988) original formulation: Bybee 
argues that, for functional-adaptive explanations to be valid, "it must be shown 
that the factor appealed to as explanation actually contributes to the creation 
of the particular grammatical convention" (Bybee 1988: 357). As the creation of 
a grammatical convention goes well beyond the processes and motivations by 
which a pattern first arose, the absence of evidence for functional-adaptive
motivations at the innovation stage does not provide evidence for the absence of such
motivations in the development of the grammatical pattern in question.


⁹In Croft's (2000) systematization of grammatical innovations, a mechanism very similar to
annexation is actually described as a particular kind of reanalysis: In his so-called “cryptanalysis”,
speakers feel that a conventionalized construction does not code an intended meaning component
sufficiently and thus add appropriate material (e.g. double plurals like English feet-s or
Uzbek bi-z-lar 'we-PL-PL'; see also Koch 1995 for further examples from different grammatical
domains). However, all of Croft's examples differ from the above cases in that they already
contain an overt grammatical marker that is analyzed as not being present (typically, it seems,
as a result of chunking and entrenchment (Bybee 2015: 102ff.)). The notion of annexation, by
contrast, is intended to capture the first kind of grammatical marking that arises (e.g. an
accusative case marker on formerly bare object NPs). It is thus not a kind of contextual reanalysis
of previously existing material, but the recruitment of a marker from another domain.

This leads us (back) to the nature of the evidence that is brought to bear on 
the present discussion. Haspelmath adopts the strong position that "diachronic 
evidence is not strictly speaking necessary" (p. 16) to explain a typological regu- 
larity in functional-adaptive terms. This radical position appears to stem, at least 
in part, from the observation that where we do have pieces of diachronic stories, 
it often seems to be the case that "all diachronic roads lead to the same synchronic 
Rome" (Kiparsky 2008: 38). In other words, one and the same typological state 
can arise in manifold ways, which Haspelmath (just like Kiparsky) takes as evi- 
dence for convergent evolution towards a common attractor state. This is perhaps 
one of the most interesting aspects of the present debate, as exactly the same type
of evidence (i.e. the available or reconstructed historical data) is interpreted in op- 
posite ways. Michaelis’ contribution endorses the Haspelmath-Kiparsky stance: 
the coding asymmetry between attributive and referential possessive forms is 
sometimes due to phonetic reduction processes in the more predictable (i.e. the 
attributive) function, and often due to the annexation and grammaticalization of 
more coding material in the less predictable (i.e. the referential) function, and 
again from a variety of different sources. Cristofaro, by contrast, argues (for 
number marking) that the phonetic erosion of overt singulars may have various 
language-internal motivations, and that the sheer variety of different sources for 
the grammaticalization of new plural markers simply does not point towards 
a single, unifying force. There is no obvious way to settle this issue, given the 
above-mentioned quantity and depth of resolution of actual diachronic data. This 
is exactly why proponents of functional-adaptive motivations have long sought 
to triangulate typological data with behavioural evidence from other sources. 

Within the confines of the present volume, we have not been able to represent
most of these other data sources, but the contributions by Seržant and by Levshina
do highlight the potential of analyzing performance data from historical
transition phases and from artificial language learning, respectively. The latter 
is a relatively novel experimental paradigm that, despite certain drawbacks (e.g. 
potential L1 influence), provides a useful addition to classic psycholinguistic, neu- 
rolinguistic and simulative experimentation on typologically relevant phenom- 
ena (see, e.g. Kurumada & Jaeger 2015; Bickel et al. 2015; Lestrade 2018 for these 
different types of data on typological preferences in case marking). All of these 
performance data point to the existence of functional-adaptive forces in gram- 
mar and hence cannot be neglected in the study of typological patterns. In fact, 
the recent movement in usage-based linguistics to view language as an instance 
of a "complex adaptive system" (e.g. Gell-Mann 1992; Beckner et al. 2009) suggests
that grammatical structure is shaped over time to adapt to interlocutors'
(partially conflicting) needs. 

At the same time, another important property of such complex adaptive systems
is that their developmental trajectories crucially depend on the system's initial
conditions – i.e. precisely on the nature of each "source construction". This
ties in nicely with Cristofaro's (p. 41) argument that a source-based approach 
to universals is not only able to capture the cross-linguistic commonalities (be- 
cause there are, after all, strong preferences in the kinds of grammaticalization 
processes that happen across languages) but also the exceptions: The latter, she 
argues, also fall out directly from the initial conditions, in that the languages 
with exceptional patterns may have different source constructions, or even no 
sources of the relevant type.¹⁰

While this is an attractive (again: "low-cost") proposal, we feel that it needs to 
be made more rigorous. At present, source-oriented accounts are often selective 
in their interpretation of the data. For example, when Cristofaro (p. 28) claims 
that ergatives do not apply to first- and second-person pronouns because their 
instrumental source tended not to do so either (for obvious semantic reasons), 
one wonders why the same kind of restriction does not carry over to similar cases. 
For instance, Kiparsky (2008: 36) mentions, among quite a few other examples, 
that when ablatives develop from a source with separative meaning (‘away from
X’), these sources are often limited to animates and physical objects, “and yet
we don't find ablatives with zero allomorphs on abstract nouns". In other words, 
purely source-based accounts are sometimes too general, as they predict all kinds 
of restrictions that get levelled as diachrony unfolds. Conversely, they can also 
be too limited because, as noted by Kiparsky (2008) as well, some synchronic 
patterns of differential marking are rather difficult to derive from their sources 
(e.g. animacy restrictions on the Genitive in Yukaghir). While Kiparsky's critique 
was directed chiefly against Garrett (1990), it seems to us that the same argument 
applies to more recent incarnations of source-oriented explanations. 

This observation rounds off our survey of the arguments laid out in the volume,
and we now proceed to some implications from and possible future directions for 
the debate as a whole. 


¹⁰This argument is also supported by Diessel's contribution: He shows (p. 115) that instances
of postposed adverbial clauses with final subordinators ('his going-to-the-movies because')
are unexpected from a processing perspective, but receive a natural explanation in diachronic 
terms if one realizes that these structures exist in OV languages which place the source con- 
struction, i.e. oblique PPs, in postverbal rather than preverbal position. 




2 Lessons, challenges and future directions 


One important general lesson from Haspelmath's lead article is that the very no- 
tion of “diachronic explanation" in typology is too vague, as both source- and 
result-oriented explanations crucially involve diachronic processes, but in different
ways (see §1 above). Furthermore, it is necessary to specify the requirements,
as Haspelmath does, on when a so-called source-oriented account of universals 
is considered a genuine explanation, which is precisely what a terminological 
proposal like “mutational explanation" is meant to capture. But it is less clear 
to us whether such a mutational explanation is best described as a "constraint" 
on language, and whether it is as such felicitously juxtaposed with "functional- 
adaptive", "acquisitional" and “representational” constraints (Haspelmath, p. 7). 
The primary reason for this uncertainty is that “mutational constraints", unlike 
the others proposed by Haspelmath, do not have a clear locus, as it were. The 
changes they are intended to capture are themselves rooted in forces operating 
on language users (and thus ultimately on language systems) and these could, 
in principle, be functional-adaptive constraints, constraints on learning or con- 
straints on change from innate linguistic representations, and possibly others. In 
other words, “mutational constraints" are always due to something else, some- 
thing that really constrains the mutations (as Haspelmath notes himself (p. 9, fn. 
7)), and so they do not, strictly speaking, form a paradigmatic opposition with 
these other constraints. 

To give just one example, Haspelmath argues that the universal generalization 
that all languages with nasal vowels also have nasal consonants can be accounted 
for directly by the restricted ways in which nasal vowels come about, i.e. most 
typically by regressive assimilation to a following nasal consonant. But this mu- 
tational constraint is motivated, at least in part, by processes that one may well 
describe as "functional", viz. the anticipation and consequent retiming of articulatory
gestures that come with repeated exposure and practice of the VC sequences
in question (Bybee 2015: 38; but cf. Ohala 1989; 2003 for alternative explanations). 
Is this "functional" in the sense of "functional-adaptive" (because it results in eas- 
ier or more economical articulation for the speaker) or in the sense of being a 
natural consequence of frequency-based memory representations and our predic- 
tions based on those frequencies? In the latter case, could this possibly count as a 
"representational constraint"? According to Haspelmath, it cannot, because the 
representational constraints he envisages are innate linguistic representations (= 
"costly" stipulations of structure that cannot be explained in more direct terms). 
But then it is unclear how to classify the frequency effects from exemplar representations
that are not straightforwardly adaptive in nature, such as some of the
"conserving effects of token frequency" discussed in many places by Bybee and 
others (e.g. Bybee & Thompson 1997; Pierrehumbert 2001; Bybee 2001; see also 
Cristofaro 2015 for discussion): Well-entrenched representations are more resis- 
tant to change, but that does not necessarily mean that they make speaker-hearer 
interactions (or a linguistic system) more efficient or otherwise better adapted.¹¹
Similar remarks apply to the pragmatic processes that constrain reanalysis (see 
fn. 8 above): These are quite systematic and thus principled constraints on the 
development of languages, yet they are not adaptive according to Haspelmath's 
definition and, therefore, defy classification according to his schema. 

A related difficulty pertains to the separation of representational and acqui- 
sitional constraints (see also Levshina's paper). In generative approaches, in- 
nate representations are invoked precisely in order to solve learnability prob- 
lems, making it hard to draw the line between acquisition and representation. In 
non-generative approaches, by contrast, a constraint on acquisition only makes 
sense if it can be disentangled from functional-adaptive constraints (cf. "What 
is functional is learnt best"). And crucially, in both approaches it would have 
to be shown that processes of language learning are causally involved in the di- 
achronic development of languages. While much scepticism has been voiced that 
(monolingual) L1 acquisition should be causally related to language change (see 
e.g., Croft 2000; Heine & Kuteva 2007; Diessel 2011 for recent surveys), it is very 
likely that certain forms of bilingualism and L2 acquisition play such a causal role, 
namely in situations of language contact (Matras 2009; Meisel et al. 2013; Gast 
2017 and many others). But then again, the difficulty remains of separating the 
acquisition processes from either representational or functional-adaptive forces 
involved in them.¹²


¹¹Incidentally, Diessel’s contribution also distinguishes between "functional" and “cognitive”
motivations for the development of preposed adverbial clauses (p. 115). The former relate, for
example, to information structure and iconicity, and are adaptive in Haspelmath's sense. The
latter refer to the cognitive processes involved in the grammaticalization of adverbial clauses, 
notably “automatization, semantic bleaching and formal reduction" (p. 112). Apparently, then, 
these specific effects of cognitive representation are to be kept distinct from "functional" fac- 
tors, which shows that they do not fit easily into Haspelmath's typology of constraints. 

¹²More generally, the issue of language contact has received rather little attention in this volume
(apart from Michaelis' contribution on contact languages, of course). Needless to say, we do not 
wish to marginalize the role of contact for diachronic development. But for one thing, the en- 
tire discussion revolves around the notion of universals, which are usually seen as “distilled” 
properties of linguistic structure after contact-induced similarities are controlled for (Bickel 
2011). Secondly, as argued above, what happens in contact situations deserves its own detailed 
investigation in order to clearly disentangle different types of forces on diachronic develop- 
ment in these contexts. This may reveal similar pressures on development as in non-contact 
It is beyond the scope of this short epilogue to elaborate on these important
issues. The critical point is simply that Haspelmath's constraint typology must 
be interpreted carefully for what it is, namely a typology of different types of 
explanation: If we neglect the more elusive acquisitional constraints for now, the 
three remaining “mutational”, “functional-adaptive” and “(UG-)representational” 
approaches indeed constitute three very different practices of explaining how 
universals in language develop. And as such, they can plausibly be ranked on 
a cost scale which characterizes the number of explanatory principles beyond 
those that are necessary to explain the origin of each individual construction in 
one’s sample (mutational > functional-adaptive > UG-representational). But this 
does not exhaust what we may wish to call constraints on language, i.e. the sum 
of all pressures or forces that “cause languages to change in preferred or ‘natural’ 
ways” (Bickel et al. 2015: 29). The different types of explanation rather highlight
that either there is or there is not more to the motivation of language universals 
than the persisting properties of individual source constructions.¹³

Our own view is that persistence effects from source constructions are one of 
many forces which constrain the development of linguistic structure and thus 
have a role to play in the explanation of universals pertaining to these structures.
But as laid out in §1 above, they are rarely ever the whole story. Seržant’s
contribution to the present volume is particularly insightful in this regard, as he 
shows that the respective sources of individual differential object markers exert a 
strong influence on the current use of these markers, but that functional-adaptive 
considerations of efficient information processing (particularly ambiguity avoid- 
ance) interact with the source meaning at particular historical stages, and can 
even pave the way for the further development of the marker in question. So, 
just as we argued above, each synchronic state of a complex adaptive system depends 
to some extent on its initial conditions; but it is adaptive nevertheless. As 
Shibatani (2006: 263) puts it, a language should be seen “as a historically-evolving 
functional organism sustaining constant pressure for adaptation”. 

languages (as argued by Michaelis) or else point to the overriding importance of other, more 
contact-specific, factors (e.g. patterns of L2 learning, constraints on borrowing, etc.). See also 
Matras (2009) for a book-length survey of these issues, and Hickey (2017) for a state-of-the-art 
collection on areal linguistics. 

Ultimately, such a typology of constraints on language would have to accommodate, for example, 
environmental factors (humidity, altitude), socio-cultural factors (population size, 
socio-cultural practices, social goals in communication), communicative/pragmatic factors (biases in 
inference making and the resulting utterance interpretation) and cognitive factors, with the 
latter to be worked out more specifically in terms of whether or not they are domain-general 
abilities or domain-specific biases and to what extent they are innate or learned during acquisition. 
Some recent systematizations include Christiansen & Chater (2008), Evans & Levinson (2009) 
and Bybee (2010), but none of them addresses all of the above dimensions (or claims to be 
exhaustive). 

Therefore, while the present volume sought to encourage a lively debate be- 
tween, and hence often a rigid juxtaposition of, source- and result-oriented ex- 
planations, it seems likely that most typological phenomena will need a nuanced 
mixture of both (see also Dryer's and Diessel's contributions to this volume). In 
fact, this echoes an assessment given by Nichols (2008: 287-288): 


Rather than synchronic patterns always being the goal and driving force 
of language change, various synchronic patterns are the predictable conse- 
quences of diachronic processes which have their own logic independent of 
the synchrony they produce. Thus, to a greater extent than [one might pre- 
sume], synchronic structural patterns are epiphenomenal. But they are not 
entirely so. Economies of various kinds appear to be targets of change [...], 
and there appear to be [...] structural patterns that may be goals of change 
but are not its accidental results. 


The methodological challenge ahead is thus to calibrate more precisely, for 
one grammatical domain and typological generalization at a time, how much 
room for functional-adaptive motivations is left once one controls for persistence 
effects in the data as much as possible. As Collins (p. 47) reminds us, constraints 
inherited from the source add yet another kind of dependency (on top of areal 
and genealogical relations) to typological samples. It then becomes an empiri- 
cal question whether the sources are so clearly circumscribed that they, indeed, 
suggest a mutational explanation and give us the synchronic distributions for 
free, or whether it is necessary to resort to costlier explanations along functional- 
adaptive lines that go beyond the individual sources. 

Another avenue for conceptual work on universals and diachrony would be 
to expand and flesh out a framework developed by Greenberg (1978), Nichols 
(1992; 2003) and Bickel (2013). This framework lays the conceptual foundations 
for modelling probabilities of cross-linguistic unity and diversity in diachronic 
terms. For example, according to Greenberg (1978: 76), a particular linguistic 
phenomenon should be universal or near-universal “if it can arise very frequently 
and is highly stable once it occurs. [...] If a particular property rarely arises but 
is highly stable when it occurs, it should be fairly frequent on a global basis but 
be largely confined to a few linguistic stocks.” Nichols (2008: 287-288) further 
develops such predictions by bringing contact-induced phenomena (borrowing 
and substrate influence) and functional-adaptive factors (“harmony”, 
“unmarkedness”) into the equation. Elaborating on these lines of thinking, one may look at 
some of the themes of the present volume in the following way (see Grossman 
2016 and Grossman et al. 2018 for more details): 


(1) Types of diachronic influence on language universals 


a. SOURCE: frequency of the source construction (Cristofaro 2019 [this 
volume]) 

b. TYPE: frequency of type of change (e.g. assimilatory changes are more 
common than dissimilatory ones). This has rarely been studied on the 
basis of large samples and for a wider range of phenomena (largely 
due to the epistemological problems discussed in §1 above), thus 
constituting a desideratum in typological research. 

c. PATH: number of pathways that lead to a particular item type (e.g. 
Bybee et al. 1994 on tense-aspect-mood constructions) 


d. STAGE: number of stages necessary to yield a certain outcome (e.g. 
Harris 2008) 


e. STABILITY: inherent stability of item type (Greenberg 1978, Nichols 
2003) 


f. DIFFUSABILITY: likelihood to diffuse through contact (borrowing, 
calquing, contact-induced grammaticalization) 


Note that (a), (b), (e) and (f) may themselves be causally related to functional- 
adaptive forces. For example, a given phenomenon may be faithfully inherited 
and hence be diachronically stable precisely because it is adaptive in Haspel- 
math's sense; and it may easily diffuse in language contact for the same reason 
(see also Bickel 2013; 2017 for the same observations). Just as in Greenberg (1978), 
then, the basic idea is that the more these factors converge, the stronger the 
cross-linguistic preponderance of the structure in question. In other words, if a 
property develops from cross-linguistically frequent sources, as a result of com- 
mon types of change that involve few stages, if there are multiple pathways that 
lead to it, it is stable once present, and it is likely to diffuse through borrow- 
ing, this property is predicted to be (nearly) universal. Conversely, a property 
that involves rare source constructions, rare changes, and so on, is predicted to 
be cross-linguistically rare or limited areally and/or phylogenetically. Of course, 
these factors might have varying strengths, and it is a goal of typological re- 
search to determine their relative ranking for each case in question. For example, 
it may be that a property develops often, from multiple and common sources, but 
if it is inherently unstable (say, due to a strong functional-adaptive pressure to 
eliminate it), then it is predicted to be sporadically attested areally and genealogically. 
If it is diffusable, then it has a good chance to take root in particular areas. 
In phonology, for example, this seems to be the case for aspirated fricatives 
(Jacques 2011) and for affricate-rich systems (Nikolaev & Grossman 2018), the lat- 
ter of which are diachronically unstable unless supported areally. As far as we 
are aware, Bickel's (2011; 2013) Family Bias Method has offered the first princi- 
pled way of incorporating some of these considerations into actual typological 
methodology (see Schmidtke-Bode's paper for an application). It is to be hoped 
that such methods, alongside more classic sampling methods with built-in controls 
for source-related dependencies (as suggested above), will become de rigueur 
in future typological research. 
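
The interplay of how often a property arises and how stable it is once present can be 
made concrete with a deliberately simplified sketch. The code below is not Bickel's Family 
Bias Method, nor a model proposed by any of the authors cited above; it merely assumes, for 
illustration, that a language either has or lacks a property and that it gains or loses it with 
fixed probabilities per time step (a two-state Markov chain). The function names and all 
numerical values are hypothetical, and factors such as the number of pathways, the number of 
stages, and diffusion through contact are left out entirely. 

```python
def stationary_share(p_gain: float, p_loss: float) -> float:
    """Long-run share of languages showing a property in a two-state Markov chain.

    p_gain: assumed per-step probability that a language lacking the property acquires it
    p_loss: assumed per-step probability that a language having the property loses it
    """
    if not (0.0 <= p_gain <= 1.0 and 0.0 <= p_loss <= 1.0):
        raise ValueError("probabilities must lie between 0 and 1")
    if p_gain + p_loss == 0.0:
        raise ValueError("share is undefined when the property can neither arise nor be lost")
    return p_gain / (p_gain + p_loss)


def share_after(p_gain: float, p_loss: float, steps: int, start: float = 0.0) -> float:
    """Expected share of languages with the property after a finite number of steps."""
    share = start
    for _ in range(steps):
        # Languages that keep the property plus languages that newly acquire it.
        share = share * (1.0 - p_loss) + (1.0 - share) * p_gain
    return share


if __name__ == "__main__":
    # Purely illustrative values, not estimates from any typological sample.
    scenarios = {
        "arises often, highly stable": (0.20, 0.01),
        "arises often, unstable": (0.20, 0.40),
        "arises rarely, highly stable": (0.001, 0.001),
    }
    for label, (gain, loss) in scenarios.items():
        long_run = stationary_share(gain, loss)
        short_run = share_after(gain, loss, steps=50)
        print(f"{label}: long-run {long_run:.3f}, after 50 steps {short_run:.3f}")
```

Under these assumptions, a property that arises often and is rarely lost approaches 
near-universality, whereas one that arises often but is quickly lost remains a minority 
pattern. A property that arises rarely but is stable stays rare over short time spans even 
though its long-run share is much higher, which is one way of seeing why time depth and 
genealogical dependencies matter for the interpretation of synchronic frequencies. 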

Above all, we hope that the present volume has offered an insight into cur- 
rent ways of thinking about the role of diachronic processes in explaining uni- 
versal generalizations, and that it has contributed to specifying the arguments, 
strengths and weaknesses of different positions in that debate. 
Explanation in typology 


This volume provides an up-to-date discussion of a foundational issue that has recently 
taken centre stage in linguistic typology and which is relevant to the language sciences 
more generally: To what extent can cross-linguistic generalizations, i.e. statistical univer- 
sals of linguistic structure, be explained by the diachronic sources of these structures? 
Everyone agrees that typological distributions are the result of complex histories, as 
“languages evolve into the variation states to which synchronic universals pertain” 
(Hawkins 1988). However, an increasingly popular line of argumentation holds that many, 
perhaps most, typological regularities are long-term reflections of their diachronic 
sources, rather than being ‘target-driven’ by overarching functional-adaptive motivations. 
On this view, recurrent pathways of reanalysis and grammaticalization can lead to 
uniform synchronic results, obviating the need to postulate global forces like ambiguity 
avoidance, processing efficiency or iconicity, especially if there is no evidence for such 
motivations in the genesis of the respective constructions. On the other hand, the recent 
typological literature is equally rife with talk of ‘complex adaptive systems’, ‘attractor 
states’ and ‘cross-linguistic convergence’. One may wonder, therefore, how much room 
is left for traditional functional-adaptive forces and how exactly they influence the di- 
achronic trajectories that shape universal distributions. The papers in the present volume 
are intended to provide an accessible introduction to this debate. Covering theoretical, 
methodological and empirical facets of the issue at hand, they represent current ways 
of thinking about the role of diachronic sources in explaining grammatical universals, 
articulated by seasoned and budding linguists alike. 